R: reshape data from long to wide and vice versa


I wrote two wrapper functions around cast and melt to bring data from long to wide form and vice versa. However, I still struggle with the function reshape_wide, which brings data from long form to wide form.

Here is an example with the functions plus the code to run it. I created a dummy data.frame in wide format, which I reshape to long format using my reshape_long function and then transform back to the original wide form using the reshape_wide function. However, the reshaping fails for a reason I cannot figure out. It seems that the formula used in dcast is wrong.

    library(reshape2)

    # Wide -> long: melt, coerce values to numeric, drop missing values
    reshape_long <- function(data, identifiers) {
        data_long <- melt(data, id.vars = identifiers,
                          variable.name = "name", value.name = "value")
        data_long$value <- as.numeric(data_long$value)
        data_long <- data_long[!is.na(data_long$value), ]
        return(data_long)
    }

    # Long -> wide: this is the function that does not behave as expected
    reshape_wide <- function(data, identifiers, name) {
        if (is.null(identifiers)) {
            formula_wide <- as.formula(paste(paste(identifiers, collapse = "+"),
                                             "series ~ ", name))
        } else {
            formula_wide <- as.formula(paste(paste(identifiers, collapse = "+"),
                                             "+ series ~ ", name))
        }
        # Running counter within each variable name
        series <- ave(1:nrow(data), data$name, FUN = function(x) { seq.int(along = x) })
        data <- cbind(data, series)
        data_wide <- dcast(data, formula_wide, value.var = "value")
        data_wide <- data_wide[, !(names(data_wide) %in% "series")]
        return(data_wide)
    }

    data <- data.frame(id = rep("k", 6), type = c(rep("a", 3), rep("b", 3)),
                       x = c(NA, NA, 1, 2, 3, 4), y = 5:10, z = c(NA, 11, 12, NA, 14, NA))
    data <- reshape_long(data, identifiers = c("id", "type"))
    data
    reshape_wide(data, identifiers = c("id", "type"), name = "name")

Here is a link to the R output when I run the code above:

http://pastebin.com/ej8f9gnl

What is wrong is that in the column type, b appears 5 times rather than 3 times as it should. How can I get the same data.frame back?

Here is the R output of sessionInfo():

    > sessionInfo()
    R version 2.14.0 (2011-10-31)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
    [1] C

    attached base packages:
    [1] grid      stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] reshape2_1.2.1       outliers_0.14        lme4_0.999375-42
     [4] Matrix_1.0-1         gregmisc_2.1.2       gplots_2.10.1
     [7] KernSmooth_2.23-7    caTools_1.12         bitops_1.0-4.1
    [10] gtools_2.6.2         gmodels_2.15.1       gdata_2.8.2
    [13] lattice_0.20-0       dataframes2xls_0.4.5 RankProd_2.26.0
    [16] R.utils_1.9.3        R.oo_1.8.3           R.methodsS3_1.2.1
    [19] xlsx_0.3.0           xlsxjars_0.3.0       rJava_0.9-2
    [22] rj_1.0.0-3

    loaded via a namespace (and not attached):
    [1] MASS_7.3-16   nlme_3.1-102  plyr_1.6      rj.gd_1.0.0-1 stats4_2.14.0
    [6] stringr_0.5   tools_2.14.0

The example cannot work: since id and type do not form a primary key (i.e., since there are several rows with the same id and type), once the data is put in tall format you no longer know whether two values come from the same row.

Also, I am not sure what you are trying to do with the series column, but it does not seem to work.
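To illustrate why a per-variable counter cannot recover the original rows, here is a small sketch in base R (no reshape2 needed). It rebuilds the long data by hand, which is an assumption made only to keep the example self-contained, then recomputes the same kind of series counter with ave. Because the NA values were dropped, the k-th remaining x no longer sits on the same original row as the k-th y or z.

```r
# Same dummy data as in the question (wide format).
d <- data.frame(
  id   = rep("k", 6),
  type = c(rep("a", 3), rep("b", 3)),
  x = c(NA, NA, 1, 2, 3, 4),
  y = 5:10,
  z = c(NA, 11, 12, NA, 14, NA)
)

# Hand-built long format (stand-in for melt), with NA values dropped
# as reshape_long does.
long <- data.frame(
  id    = "k",
  type  = rep(d$type, times = 3),
  name  = rep(c("x", "y", "z"), each = nrow(d)),
  value = c(d$x, d$y, d$z)
)
long <- long[!is.na(long$value), ]

# Running counter within each variable name, as in reshape_wide.
long$series <- ave(seq_len(nrow(long)), long$name, FUN = seq_along)

# Distinct series values seen for each type: these become the rows
# of the wide result when casting on id + type + series.
sort(unique(long$series[long$type == "a"]))  # 1 2 3
sort(unique(long$series[long$type == "b"]))  # 2 3 4 5 6
```

Type b ends up with five distinct series values, which would explain the five b rows in the cast result.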

    library(reshape2)
    d <- data.frame(
      id = rep("k", 6),
      type = c(rep("a", 3), rep("b", 3)),
      x = c(NA, NA, 1, 2, 3, 4),
      y = 5:10,
      z = c(NA, 11, 12, NA, 14, NA)
    )
    d$row <- seq_len(nrow(d))  # (row, id, type) is now a primary key
    d
    d1 <- reshape_long(d, identifiers = c("row", "id", "type"))
    d1
    dcast(d1, row + id + type ~ name)  # What you want
    reshape_wide(d1, identifiers = c("row", "id", "type"), name = "name")
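Before reshaping, it can help to check whether the chosen identifier columns really form a primary key. The following is a minimal sketch in base R; the helper name key_is_unique is hypothetical and not part of reshape2.

```r
# Same dummy data as above (wide format).
d <- data.frame(
  id   = rep("k", 6),
  type = c(rep("a", 3), rep("b", 3)),
  x = c(NA, NA, 1, 2, 3, 4),
  y = 5:10,
  z = c(NA, 11, 12, NA, 14, NA)
)

# Hypothetical helper: TRUE when no combination of the identifier
# columns occurs more than once. anyDuplicated() returns 0 in that
# case, otherwise the index of the first duplicated row.
key_is_unique <- function(data, identifiers) {
  anyDuplicated(data[, identifiers, drop = FALSE]) == 0
}

key_is_unique(d, c("id", "type"))         # FALSE: several rows share id and type

d$row <- seq_len(nrow(d))                 # add an explicit row identifier
key_is_unique(d, c("row", "id", "type"))  # TRUE
```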
