Random sample of rows from a subset of an R dataframe
This question already has an answer here:
- Sample random rows in dataframe (7 answers)
Is there a way of getting a sample of rows from part of a dataframe?
If I have data such as
gender <- c("f", "m", "m", "f", "f", "m", "f", "f")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
then I can sample the ages of 3 of the fs with
sample(age[gender == "f"], 3)
and get
[1] 31 35 29
But if I turn the data into a dataframe
mydf <- data.frame(gender, age)
I cannot use the obvious
sample(mydf[mydf$gender == "f", ], 3)
though I can concoct something convoluted with an absurd number of brackets
mydf[sample((1:nrow(mydf))[mydf$gender == "f"], 3), ]
and get what I want
  gender age
7      f  35
4      f  29
1      f  23
Is there a better way, one that takes me less time to work out how to write?
Your convoluted way is pretty much how to do it - I think all the answers will be variations on that theme.
For example, you can generate the indices where mydf$gender=="f" first:
idx <- which(mydf$gender=="f")
then sample from that:
mydf[ sample(idx,3), ]
So in one line (although you could reduce the absurd number of brackets, and possibly make the code easier to understand, by having multiple lines):
mydf[ sample( which(mydf$gender=='f'), 3 ), ]
While the "wheee I'm a hacker!" part of me prefers the one-liner, the sensible part of me says that even though the two-liner is two lines, it is more understandable - it's your choice.
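If you find yourself doing this often, the two steps could be wrapped in a small function. The sketch below is only one way to do it: sample_rows is a hypothetical helper name (not a base R function and not from the answer above), and set.seed() is there purely so the draw is repeatable. It indexes into idx with sample.int() instead of calling sample(idx, n), because sample() treats a single number k as "sample from 1:k", which would pick the wrong rows if only one row matched.

```r
# Data from the question
gender <- c("f", "m", "m", "f", "f", "m", "f", "f")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
mydf <- data.frame(gender, age)

# Hypothetical helper: draw n random rows (without replacement)
# from the rows of df for which keep is TRUE.
sample_rows <- function(df, keep, n) {
  idx <- which(keep)  # row numbers of the matching rows; which() drops NAs
  # Pick n positions within idx, then map them back to row numbers.
  # This avoids sample(idx, n) misbehaving when idx has length 1.
  df[idx[sample.int(length(idx), n)], , drop = FALSE]
}

set.seed(1)  # only so that the result is repeatable
fs <- sample_rows(mydf, mydf$gender == "f", 3)
fs
```

As with plain sample(), asking for more rows than match the condition stops with an error rather than silently returning fewer.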