Problem description:

I have a document-term matrix with frequencies for more than 600 words and a corresponding date (mm/dd/yyyy) for each row of frequencies:

 > head(mydata3,3)

Claim.Number Note.Date LOSSDATE DATEREPORTED

1 106810 7/10/1998 12/9/1997 12/29/1997

2 106810 7/21/1998 12/9/1997 12/29/1997

3 106810 10/21/1999 12/9/1997 12/29/1997

DATEENTERED Row Topic absenc abus academ access

1 1/5/1998 3 4 0 0 0 0

2 1/5/1998 4 2 0 0 0 0

3 1/5/1998 8 11 0 0 0 0

accid accommod account accus act action activ add

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

addit addl adequ adjust administr admiss advanc

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

advers advic african age agenc agreement aid ambul

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

amount analysi ankl answer anticip appeal appel

1 0 0 0 0 0 0 0

2 0 0 0 0 0 2 0

3 0 0 0 0 0 1 0

appli applic appoint appropri approv approxim arbitr

1 0 0 0 1 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

argu argument aris arm arrang arriv asap assault

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 1 0 0 0 0 0

assert assess assist athlet attach attent audit auto

1 0 0 0 0 0 2 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

avoid await award background balanc ball bar basi

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

benefit big bill black board breach break. brief

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

broken broker budget build bus busi call campus cap

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 2 0 0

3 0 0 0 0 0 0 0 0 0

car care carrier center cgl chair chang charg child

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

children circuit cite citi civil clean client clinic

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

close closur cmc coach code collect commit committe

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

communic compani compar compel compens complain

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

complet conclud condit conduct conf confer confid

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

conflict connect construct consult contact contend

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

contract contractor contribut control convers

1 0 0 0 0 0

2 0 0 0 0 0

3 0 0 0 0 0

convinc cooper coordin copi correct cost counter

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 1 0 0 0

counti cours court cover coverag creat credibl

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

credit crimin cross cut damag danger deadlin deal

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

dean death decis declin deduct defam defect defend

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

degre delay demand deni denial depart depos deposit

1 0 0 0 0 0 0 0 0

2 1 0 0 1 0 0 0 0

3 1 0 0 0 0 0 0 0

dept despit develop diari difficult director disabl

1 0 1 0 1 0 0 0

2 1 0 0 0 0 0 0

3 0 0 0 0 0 0 0

discharg disciplin disciplinari discoveri discrimin

1 0 0 0 0 1

2 0 0 0 0 1

3 0 0 0 0 0

discuss dismiss disput distress district doc docket

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

doctor document done door dorm doubt draft drive

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 1 0

3 0 0 0 0 0 0 0 0

driver drop due earlier earn educ eeoc effort ell

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

els email emot employ employe encourag end endors

1 0 0 0 0 0 0 1 0

2 0 0 0 0 0 0 0 0

3 0 0 0 1 2 0 1 0

enrol entitl environ estim evalu event evid exam

1 0 0 0 0 0 0 0 2

2 0 0 0 0 0 0 0 2

3 0 0 0 0 0 0 0 2

examin exceed excess exchang exclus execut expens

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

experi expert expir exposur extend extens extent

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

extrem eye face facil faculti fail failur fall fals

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 1 2 1 0 0

3 0 0 0 0 0 3 0 0 0

fault favor fax feder fee fell femal field fight

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

final financi finish fire firm floor focus foot forc

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

form formal former forward fractur free fund futur

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

game gender gone grade graduat grant grievanc ground

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 1 0 0

3 0 0 0 1 1 0 0 0

group hand happi harass head health hear held higher

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

hire histori hit hold home hospit hostil hous human

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

ice identifi immedi immun impact import impress

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

improv inappropri inclin incur indemn individu injur

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

injuri inquir inquiri inspect instruct intent

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

interest intern invoic job joint judg judgment juri

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 1 0 0 0 0 0 0

jurisdict key knee knowledg lacer lack larg latest

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

law lawyer layer learn leav leg legal letter level

1 0 0 0 0 0 0 0 1 0

2 0 1 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

liabil lien life limit litig live lmtcb local lose

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

loss lost low mail mainten major male manag materi

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

mcad med mediat medic medicar meet memo merit messag

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 2 0 0 0

million minor mom money monitor motion msj mtd

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

nation near neck neglig negoti news noth notic

1 1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

notifi numer nurs object oblig ocr offer offici ongo

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 2 0 0 0

open oper opinion opportun oppos opposit oral order

1 0 0 0 0 0 0 0 0

2 1 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

origin outlin outstand owe paid pain park parti

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

partner pass pay payment pend perman personnel petit

1 0 1 0 0 0 0 0 0

2 0 1 0 0 0 0 0 0

3 0 2 0 0 1 0 0 0

phone photo physic physician pictur plan player

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

plead poa polic polici poor postpon potenti practic

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

preliminari premis prepar pres presid press pressur

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

prevail prevent primari privat proceed product

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

profession professor progress project promis promot

1 0 0 0 0 0 0

2 0 1 0 0 0 0

3 0 2 0 0 0 0

proper properti propos protect provis provost pull

1 0 0 0 0 0 0 0

2 0 0 0 0 0 1 0

3 0 0 0 0 0 0 0

punit pursu push qualifi quick quiet quit race rais

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

rang rate reach recal receipt recov recoveri rediari

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

reduc reimburs reinsur reject relationship releas

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

relief remain remedi remov renew reopen rep repair

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 1 0 0 0 0 0

repeat. replac repli repres represent research

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

reserv resid resign resolut resolv respect respond

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

rest retain retali retent retir return reveal review

1 0 0 0 0 0 0 0 2

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 1

revis risk role ror rts rule run safeti salari

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

schedul search section secur select semest separ

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

serious serv servic settl settlement sex sexual

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

shoulder side sidewalk sign signific sir sit site

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

situat slip small snow speak spent split staff stage

1 0 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0

stair standard statement status statut step stop

1 0 0 0 0 0 0 0

2 0 0 0 2 0 0 0

3 0 0 0 0 0 0 0

stori strategi street strike struck studi subject

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

substanti success sue suffer suffici suggest summari

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

supervis supervisor supplement supv surgeri suspect

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

suspend sustain system tabl tcw teach teacher team

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

telephon tender tenur term termin test testifi

1 0 0 0 0 0 0 0

2 0 0 0 0 0 1 0

3 0 0 0 0 0 0 0

testimoni theori threaten titl top total tpa track

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

train transcript transfer transport travel treat

1 0 0 0 0 0 0

2 0 0 0 0 0 0

3 0 0 0 0 0 0

treatment trial trip troubl tuition unabl unclear

1 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0

unfortun upcom updat vacat valu vehicl verdict video

1 0 0 1 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

violat visitor voicemail wage wait walk warn watch

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 0 0

water weak white win withdraw worker write written

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 0 0 0 0 0 0 1 0

wrote xbocx xdolx ximex xmsjx xnpcx xoopx xprosex

1 0 0 0 0 0 0 0 0

2 0 0 0 0 0 0 0 0

3 1 0 0 0 0 0 0 0

xsolx

1 0

2 0

3 0

I am trying to group the frequency values by month/year and year. For example, for the word "appeal", instead of having 2 occurrences on 1/5/1998, and another occurrence on 1/5/1998, I would like to have 3 occurrences for 1/1998, and then also 3 occurrences (assuming there aren't any more hits for the rest of the year) for 1998. Then I would like to plot the frequency per month/year vs. month/year, and the frequency per year vs. year.

I tried using the following code to group by month/year:

df %>%
  mutate(month_year = format(date, "%Y/%m")) %>%
  group_by(month_year) %>%
  summarise(total = sum(vocabfreq))

where vocabfreq stands for all of the columns holding word frequencies in the original data set. Another problem is that my data set is quite large, and I am having difficulty plotting multiple series on one graph in a way that shows their distinctive features.

Answer:

The xts method:

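library(xts)

# Example data: one row per note, one column per word
dat <- data.frame(date=c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'),
                  word1= c(1,2,1, 4, 3), word2=c(3, 10, 1, 2, 4))
# Parse the mm/dd/yyyy strings and use them as the time index
dates <- as.POSIXct(dat$date, format='%m/%d/%Y')
dat.xts <- xts(subset(dat, select= -date), order.by=dates)
# Sum every word column within each day, then within each month
apply.daily(dat.xts, colSums)
apply.monthly(dat.xts, colSums)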
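
The question also asks for totals per year. A minimal sketch of how the same xts approach might be extended, assuming the toy data above rather than the asker's real data set; the object names monthly and yearly are only illustrative, and a Date index is swapped in for POSIXct so that time zones cannot shift a row across a period boundary:

```r
library(xts)

# Toy data in the same shape as the answer above (not the asker's real data).
dat <- data.frame(date  = c("7/10/2014", "7/10/2014", "7/11/2014", "8/05/2015", "9/21/2015"),
                  word1 = c(1, 2, 1, 4, 3),
                  word2 = c(3, 10, 1, 2, 4))

# Index by Date; the word columns become the series.
dat.xts <- xts(dat[, c("word1", "word2")],
               order.by = as.Date(dat$date, format = "%m/%d/%Y"))

# Column sums within each calendar month and each calendar year present in the data.
monthly <- apply.monthly(dat.xts, colSums)
yearly  <- apply.yearly(dat.xts, colSums)
```

Each result is itself an xts object indexed by the last observation date in the period, so it can be passed on to plotting functions that accept xts objects.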
Answer:

You should use summarise_each instead of summarise. Btw, I'm using @DunderChief's code to generate the data. Thank you for that.

dat <- data.frame(date=c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'),
              word1= c(1,2,1, 4, 3), word2=c(3, 10, 1, 2, 4))
library(dplyr)

dat %>%
  mutate(date = as.Date(date, format='%m/%d/%Y')) %>%
  group_by(date) %>%
  summarise_each(funs(sum(.)))
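
Note that the code above groups by the full date, and that summarise_each() and funs() are deprecated in current dplyr releases in favour of across(). A rough sketch of what grouping by month/year and by year might look like with across(), assuming dplyr 1.0.0 or later and the same toy data; by_month and by_year are illustrative names, and the explicit column list c(word1, word2) would need to be replaced by a selection of the real word columns:

```r
library(dplyr)

# Toy data in the same shape as the answer above (not the asker's real data).
dat <- data.frame(date  = c("7/10/2014", "7/10/2014", "7/11/2014", "8/05/2015", "9/21/2015"),
                  word1 = c(1, 2, 1, 4, 3),
                  word2 = c(3, 10, 1, 2, 4))

# Parse the date once, then derive the two grouping keys from it.
dat <- dat %>%
  mutate(date       = as.Date(date, format = "%m/%d/%Y"),
         month_year = format(date, "%Y/%m"),
         year       = format(date, "%Y"))

# Sum each word column within every month/year.
by_month <- dat %>%
  group_by(month_year) %>%
  summarise(across(c(word1, word2), sum))

# Sum each word column within every year.
by_year <- dat %>%
  group_by(year) %>%
  summarise(across(c(word1, word2), sum))
```

Formatting the key as "%Y/%m" keeps the groups in chronological order when sorted as text, which is convenient if the result is later used as a plot axis.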