R - Dependency matrix
I need to build a dependency matrix for the 91 variables of a data set.
I tried to use some code, but I didn't succeed.
Here are the important parts of the code:
p <- length(dati)
chisquare <- matrix(dati, nrow=(p-1), ncol=p)
It should create a square matrix of the variables.
system.time({
  for(i in 1:p){
    for(j in 1:p){
      a <- dati[, rn[i+1]]
      b <- dati[, cn[j]]
      chisquare[i, (1:(p-1))] <- chisq.test(dati[,i], dati[, i+1])$statistic
      chisquare[i, p] <- chisq.test(dati[,i], dati[, i+1])$p.value
    }
  }
})
It should relate the "p" variables and analyze whether they depend on each other.
Error in `[.data.frame`(dati, , rn[i + 1]) : undefined columns selected
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Timing stopped at: 32.23 0.11 32.69
warnings() #let's check
1: In chisq.test(dati[, i], dati[, i + 1]) : Chi-squared approximation may be incorrect
chisquare
#all cells (except those in the last column, which seems to hold the p-values) have the same value within each row
I also tried another way, provided to me by someone who knows how to manage R better than me:
#strange values we have in the columns
sum(dati == 'x')
#replacing "x" x
x <- dati[dati=='x']
#distribution of answers to each question
answers <- t(sapply(1:ncol(dati), function(i) table(factor(dati[, i], levels = -2:9), useNA = 'always')))
rownames(answers) <- colnames(dati)
answers
#correlation of pairs
I <- diag(ncol(dati)) #empty diagonal matrix
colnames(I) <- rownames(I) <- colnames(dati)
rn <- rownames(I)
cn <- colnames(I)
#loop
system.time({
  for(i in 1:ncol(dati)){
    for(j in 1:ncol(spain)){
      a <- dati[, rn[i]]
      b <- dati[, cn[j]]
      r <- chisq.test(a,b)$statistic
      r <- chisq.test(a,b)$p.value
      I[i, j] <- r
    }
  }
})
   user  system elapsed
  29.61    0.09   30.70
There were 50 or more warnings (use warnings() to see the first 50)
warnings() #let's check
1: In chisq.test(a, b) : Chi-squared approximation may be incorrect
diag(I) <- 1
#result
head(I)
The columns stop at the 5th variable, whereas I need to check the dependency between all the variables, each one.
I don't understand where I'm wrong, and I hope I'm not far off...
I hope to receive some help, please.
You are apparently trying to compute the p-value of a chi-squared test for all pairs of variables in the dataset. That can be done as follows.
# sample data
n <- 1000
k <- 10
d <- matrix(sample(letters[1:5], n*k, replace=TRUE), ncol=k)
d <- as.data.frame(d)
names(d) <- letters[1:k]
# compute the p-values
k <- ncol(d)
result <- matrix(1, nrow=k, ncol=k)
rownames(result) <- colnames(result) <- names(d)
for(i in 1:k) {
  for(j in 1:k) {
    result[i,j] <- chisq.test( d[,i], d[,j] )$p.value
  }
}
In addition, there may be something wrong with your data, leading to the warnings you get, but I do not know about it.
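For what it is worth, chisq.test() issues the "Chi-squared approximation may be incorrect" warning when some expected cell counts are small (below 5). Here is a minimal sketch of how one might check that for a single pair and fall back on a simulated p-value. The data and the names x, y, tab, expected, small and p_sim are made up for illustration and are not taken from your data set.

```r
set.seed(1)
# Hypothetical data: two short categorical variables, one with a rare level
x <- factor(c(rep("a", 12), rep("b", 6), rep("c", 2)))
y <- factor(rep(c("u", "v"), 10))

tab <- table(x, y)

# Expected counts under independence; the warning appears when any is below 5
expected <- suppressWarnings(chisq.test(tab))$expected
small <- any(expected < 5)

# simulate.p.value = TRUE replaces the asymptotic approximation
# with a Monte Carlo p-value based on B simulated tables
p_sim <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)$p.value
```

If many of the 91 variables have rarely used answer levels, this is a plausible source of the warnings, although I cannot tell without seeing the data.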
Your code has too many problems for me to try to enumerate them (you start by trying to create a square matrix with a different number of rows and columns, and then I am lost).
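If you want to keep both the statistic and the p-value, one possible layout is two square matrices of the same size rather than one matrix with an extra column. The following is only a sketch on made-up data (d, stat and pval are hypothetical names); with your data you would presumably use dati in place of d. Since the test gives the same result whichever variable comes first, each pair is computed once and mirrored.

```r
set.seed(42)
# Hypothetical sample data: 6 categorical variables, 200 rows
n <- 200
k <- 6
d <- as.data.frame(matrix(sample(letters[1:4], n * k, replace = TRUE), ncol = k))
names(d) <- LETTERS[1:k]

# Two square k x k matrices, one per quantity; the diagonal is left as NA
stat <- matrix(NA_real_, nrow = k, ncol = k,
               dimnames = list(names(d), names(d)))
pval <- stat

# Compute each unordered pair once and fill both triangles
for (i in seq_len(k - 1)) {
  for (j in (i + 1):k) {
    test <- chisq.test(d[[i]], d[[j]])
    stat[i, j] <- stat[j, i] <- unname(test$statistic)
    pval[i, j] <- pval[j, i] <- test$p.value
  }
}
```

Computing only the pairs with i < j roughly halves the number of tests, which matters a little with 91 variables.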