R - Average time distance between grouped events


I have a df of battle events within years and conflicts. I am trying to calculate the average distance (in time) between battles within conflict years.

The header looks like this:

conflictid | year | event_date | event_type
107        | 1997 | 1997-01-01 | 1
107        | 1997 | 1997-01-01 | 1
20         | 1997 | 1997-01-01 | 1
20         | 1997 | 1997-01-01 | 2
20         | 1997 | 1997-01-03 | 1

What I first tried was:

    time_prev_total <- aggregate(event_date ~ conflictid + year, data, diff)

but I end up with event_date being a list in the new df. My attempts to extract the first index position of the list within the df have been unsuccessful.
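The list column most likely appears because `diff` returns a vector whose length depends on the number of events in each group, so `aggregate` cannot simplify the result to one value per conflict-year. A minimal sketch of one way around that, assuming the column names from the question: `battles` and `mean_gap` are hypothetical names introduced here, and the data frame only reproduces the five sample rows shown above.

```r
# Hypothetical data frame built from the sample rows in the question.
battles <- data.frame(
  conflictid = c(107, 107, 20, 20, 20),
  year       = c(1997, 1997, 1997, 1997, 1997),
  event_date = as.Date(c("1997-01-01", "1997-01-01", "1997-01-01",
                         "1997-01-01", "1997-01-03")),
  event_type = c(1, 1, 1, 2, 1)
)

# Reduce each group to a single number: the mean gap, in days,
# between consecutive events (dates are sorted first).
mean_gap <- function(dates) mean(as.numeric(diff(sort(dates))))

avg_gap <- aggregate(event_date ~ conflictid + year, data = battles,
                     FUN = mean_gap)

# The aggregated column keeps the name event_date; rename it for clarity.
names(avg_gap)[names(avg_gap) == "event_date"] <- "avg_gap_days"
avg_gap
```

Because the function returns one scalar per group, the result is an ordinary numeric column. Events on the same day contribute a gap of zero, and a conflict-year with a single event yields `NaN`, since it has no gaps to average.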

Alternatively, it was suggested to me to create a time index within each conflict year, lag that index, create a new data frame with conflictid, year, event_date, and the lagged index, and then merge it with the original df, matching the lagged index in the new df to the old index in the original df. I have tried to implement this but am a little unsure how to index obs. within conflict years, since the data are unbalanced.
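The indexing step can be sketched in base R with `ave`, which applies a function within groups and returns a vector aligned with the original rows, so unbalanced groups are not a problem. This is a simplified variant of the suggestion rather than the exact merge described: the previous date is computed in place instead of being merged back in. The data frame `battles` and the columns `idx` and `gap_days` are hypothetical names.

```r
# Hypothetical sample, deliberately unbalanced and unsorted.
battles <- data.frame(
  conflictid = c(20, 107, 20, 107, 20),
  year       = 1997,
  event_date = as.Date(c("1997-01-03", "1997-01-01", "1997-01-01",
                         "1997-01-01", "1997-01-01"))
)

# Sort so that row order follows time within each conflict-year.
battles <- battles[order(battles$conflictid, battles$year,
                         battles$event_date), ]

# Running index 1, 2, ... within each conflict-year.
battles$idx <- ave(seq_len(nrow(battles)), battles$conflictid, battles$year,
                   FUN = seq_along)

# Date of the previous event in the same group, as a day count
# (NA for the first event of each group).
day_num  <- as.numeric(battles$event_date)
prev_day <- ave(day_num, battles$conflictid, battles$year,
                FUN = function(x) c(NA, head(x, -1)))
battles$gap_days <- day_num - prev_day

# Average gap per conflict-year; the formula interface drops NA rows.
avg_gap <- aggregate(gap_days ~ conflictid + year, data = battles, FUN = mean)
avg_gap
```

Note that a conflict-year with only one event has no non-missing gap, so it is absent from `avg_gap` instead of appearing with a missing value.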

You can use ddply to split the data.frame into pieces (one per year and conflict) and apply a function to each.

    # sample data
    n <- 100
    d <- data.frame(
      conflictid = sample(1:3,       n, replace = TRUE),
      year       = sample(1990:2000, n, replace = TRUE),
      event_date = sample(0:364,     n, replace = TRUE),
      event_type = sample(1:10,      n, replace = TRUE)
    )
    # turn the day-of-year offset into an actual date within that year
    d$event_date <- as.Date(ISOdate(d$year, 1, 1)) + d$event_date

    library(plyr)

    # average distance between all pairs of battles, within each year and conflict
    ddply(
      d,
      c("year", "conflictid"),
      summarize,
      average = mean(dist(event_date))
    )

    # average distance between consecutive battles, within each year and conflict
    # (sort by date first so that diff() compares neighbouring events)
    d <- d[order(d$event_date), ]
    ddply(
      d,
      c("year", "conflictid"),
      summarize,
      average = mean(diff(event_date))
    )
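The two summaries above answer different questions, which a tiny example makes concrete. In this sketch the vector `days` is a hypothetical set of event days within a single conflict-year, using plain numbers instead of dates to keep the arithmetic visible.

```r
# Hypothetical event days within one conflict-year (day of year).
days <- c(1, 3, 6)

# dist() measures every pair: |1-3| = 2, |1-6| = 5, |3-6| = 3
pairwise <- as.numeric(dist(days))

# diff() measures neighbours only: 3-1 = 2, 6-3 = 3
consecutive <- diff(sort(days))

mean(pairwise)     # 10/3, about 3.33
mean(consecutive)  # 2.5
```

If the goal is the typical waiting time from one battle to the next, the `diff` version is probably the one wanted. The `dist` version grows with the overall spread of events across the year.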
