Random sample of rows from a subset of an R dataframe



Is there a way of getting a sample of rows from part of a dataframe?

If I have data such as

gender <- c("f", "m", "m", "f", "f", "m", "f", "f")
age    <- c(23, 25, 27, 29, 31, 33, 35, 37)

then I can sample the ages of 3 of the fs with

sample(age[gender == "f"], 3) 

and get

[1] 31 35 29 

but if I turn my data into a dataframe

mydf <- data.frame(gender, age)  

I cannot use the obvious

sample(mydf[mydf$gender == "f", ], 3) 

though I can concoct something convoluted with an absurd number of brackets

mydf[sample((1:nrow(mydf))[mydf$gender == "f"], 3), ] 

and get what I want

  gender age
7      f  35
4      f  29
1      f  23

Is there a better way that takes me less time to work out how to write?

Your convoluted way is pretty much how to do it - I think any answers will be variations on that theme.

For example, I like to generate the mydf$gender=="f" indices first:

idx <- which(mydf$gender=="f") 

Then sample from that:

mydf[ sample(idx,3), ] 
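Put together with the data from the question, the two steps can be run end to end. This is only a sketch: the seed value and the name picked are arbitrary choices made here for illustration, and which rows come back depends on the seed and the R version. Note that which() also drops rows where the comparison is NA, which plain logical indexing would not.

```r
# Data from the question
gender <- c("f", "m", "m", "f", "f", "m", "f", "f")
age    <- c(23, 25, 27, 29, 31, 33, 35, 37)
mydf   <- data.frame(gender, age)

set.seed(1)  # arbitrary seed, only so the draw is repeatable

idx    <- which(mydf$gender == "f")  # row numbers of the "f" rows: 1 4 5 7 8
picked <- mydf[sample(idx, 3), ]     # 3 of those rows, without replacement

print(picked)
```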

So in 1 line (although you can reduce the absurd number of brackets, and possibly make your code easier to understand, by having multiple lines):

mydf[ sample( which(mydf$gender=='f'), 3 ), ] 

While the "wheee I'm a hacker!" part of me prefers the one-liner, the sensible part of me says that even though the two-liner is 2 lines, it is much more understandable - it's just your choice.
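If the pattern comes up often, one option is to wrap it in a small function. The sketch below assumes a helper called sample_rows, which is a made-up name and not part of base R. It indexes into idx with sample.int() instead of calling sample(idx, n), because sample() treats a single number k as the range 1:k and would therefore pick from the wrong rows whenever exactly one row matches the condition.

```r
# Hypothetical helper, not part of base R:
# sample n rows of df from among those where keep is TRUE.
sample_rows <- function(df, keep, n) {
  idx <- which(keep)           # matching row numbers; NA conditions are dropped
  stopifnot(n <= length(idx))  # fail early if too few rows match
  # Sample positions within idx, then map back to row numbers.
  df[idx[sample.int(length(idx), n)], , drop = FALSE]
}

# Data from the question
mydf <- data.frame(
  gender = c("f", "m", "m", "f", "f", "m", "f", "f"),
  age    = c(23, 25, 27, 29, 31, 33, 35, 37)
)

sample_rows(mydf, mydf$gender == "f", 3)
```

The call then reads as "sample 3 rows of mydf where gender is f", at the cost of one extra definition.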

