R - Average time-distance between grouped events
I have a data frame of battle events within years and conflicts. I am trying to calculate the average distance (in time) between battles within conflict years.
The header looks like this:
conflictid | year | event_date | event_type
107        | 1997 | 1997-01-01 | 1
107        | 1997 | 1997-01-01 | 1
20         | 1997 | 1997-01-01 | 1
20         | 1997 | 1997-01-01 | 2
20         | 1997 | 1997-01-03 | 1
What I first tried was:
time_prev_total <- aggregate (event_date ~ conflictid + year, data, diff)
but I end up with event_date being a list in the new data frame. My attempts to extract the first index position of the list within the data frame have been unsuccessful.
Alternatively, it was suggested to me to create a time index within each conflict year, lag that index, create a new data frame with conflictid, year, event_date and the lagged index, and then merge it with the original data frame, matching the lagged index in the new data frame to the old index in the original one. I have tried to implement this, but I am a little unsure how to index the observations within conflict years, since they are unbalanced.
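A minimal sketch of that lag-based idea in base R, for illustration only. It assumes the toy rows from the header above, and the names `gap_days` and `avg_gap` are made up for this example. Instead of building a lagged index and merging, it uses `ave()` to compute the gap to the previous event inside each conflict-year, which copes with unbalanced groups because each group is handled separately.

```r
# Toy data mirroring the header shown above (illustrative values only).
d <- data.frame(
  conflictid = c(107, 107, 20, 20, 20),
  year       = c(1997, 1997, 1997, 1997, 1997),
  event_date = as.Date(c("1997-01-01", "1997-01-01", "1997-01-01",
                         "1997-01-01", "1997-01-03")),
  event_type = c(1, 1, 1, 2, 1)
)

# Sort so that "previous event" is well defined inside each group.
d <- d[order(d$conflictid, d$year, d$event_date), ]

# Days since the previous event in the same conflict-year.
# The first event of each group has no predecessor and gets NA.
d$gap_days <- ave(
  as.numeric(d$event_date), d$conflictid, d$year,
  FUN = function(x) c(NA, diff(x))
)

# Average gap per conflict-year. The formula interface drops rows with
# NA by default, so the first event of each group is ignored.
avg_gap <- aggregate(gap_days ~ conflictid + year, data = d, FUN = mean)
print(avg_gap)
```

Note that a conflict-year with a single event has only an NA gap, so it would not appear in `avg_gap` at all.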
You can use ddply to split the data.frame into pieces (one per year and conflict) and apply a function to each.
    # Sample data
    n <- 100
    d <- data.frame(
      conflictid = sample(1:3, n, replace = TRUE),
      year       = sample(1990:2000, n, replace = TRUE),
      event_date = sample(0:364, n, replace = TRUE),  # day offset within the year
      event_type = sample(1:10, n, replace = TRUE)
    )
    # Turn the day offset into an actual date within the given year
    d$event_date <- as.Date(ISOdate(d$year, 1, 1)) + d$event_date

    library(plyr)

    # Average distance between battles, within each year and conflict
    # (dist() looks at all pairs of events, not only consecutive ones)
    ddply(d, c("year", "conflictid"), summarize, average = mean(dist(event_date)))

    # Average distance between consecutive battles, within each year and conflict
    # (sort by date first so that diff() compares neighbouring events)
    d <- d[order(d$event_date), ]
    ddply(d, c("year", "conflictid"), summarize, average = mean(diff(event_date)))
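If you would rather stay with the `aggregate()` call from the question, one possible fix is sketched below. The list column most likely appears because `diff` returns a vector of varying length per group; a function that returns a single number per group avoids that. The toy data and the column name `avg_gap_days` are assumptions made for this example, and conflict 33 is an invented single-event group added to show the edge case.

```r
# Toy data: the rows from the question plus one invented single-event group.
d <- data.frame(
  conflictid = c(107, 107, 20, 20, 20, 33),
  year       = c(1997, 1997, 1997, 1997, 1997, 1997),
  event_date = as.Date(c("1997-01-01", "1997-01-01", "1997-01-01",
                         "1997-01-01", "1997-01-03", "1997-05-10"))
)

# FUN returns exactly one number per group, so the result column is a
# plain numeric vector rather than a list.
avg_gap <- aggregate(
  event_date ~ conflictid + year,
  data = d,
  FUN  = function(x) mean(diff(sort(as.numeric(x))))
)
names(avg_gap)[3] <- "avg_gap_days"
print(avg_gap)
```

A group with only one event has no gaps to average, so its value is NaN here; whether to keep or drop such groups depends on the analysis.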