Sunday, April 4, 2010

Subsetting a data frame in R

The following is taken from “Statistical Computing and Graphics Course Notes” by Frank E. Harrell. http://cran.r-project.org/doc/contrib/Harrell-statcomp-notes.pdf

> # Subset a simple vector
> x1 <- 1:4
> sex <- rep(c('male','female'),2)
> subset(x1, sex=='male')
[1] 1 3
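
For comparison, here is a small sketch (not from Harrell's notes) of the same selection written with plain logical indexing. The names by_subset, by_index and sex_na are made up for illustration. As far as I know, the main practical difference is how missing values in the condition are handled: subset() drops them, while [ ] keeps an NA in the result.

```r
# Same data as in the example above
x1 <- 1:4
sex <- rep(c('male', 'female'), 2)

# subset() and logical indexing give the same result when there are no NAs
by_subset <- subset(x1, sex == 'male')
by_index <- x1[sex == 'male']
same_result <- identical(by_subset, by_index)

# Hypothetical variant with a missing value in the condition
sex_na <- c('male', NA, 'male', 'female')
dropped_na <- subset(x1, sex_na == 'male')  # the NA row is left out
kept_na <- x1[sex_na == 'male']             # the NA row comes back as NA
```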

> # Subset a data frame
> d <- data.frame(x1=x1, x2=(1:4)/10, x3=(11:14), sex=sex)
> d
x1 x2 x3 sex
1 1 0.1 11 male
2 2 0.2 12 female
3 3 0.3 13 male
4 4 0.4 14 female

> subset(d, sex=='male')
x1 x2 x3 sex
1 1 0.1 11 male
3 3 0.3 13 male

> subset(d, sex=='male' & x2>0.2)
x1 x2 x3 sex
3 3 0.3 13 male

> subset(d, x1>1, select=-x1)
x2 x3 sex
2 0.2 12 female
3 0.3 13 male
4 0.4 14 female

> subset(d, select=c(x1,sex))
x1 sex
1 1 male
2 2 female
3 3 male
4 4 female

> subset(d, x2<0.3, select=x2:sex)
x2 x3 sex
1 0.1 11 male
2 0.2 12 female

> subset(d, x2<0.3, -(x3:sex))
x1 x2
1 1 0.1
2 2 0.2
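
The row and column selections above can also be written with [ ] indexing. The following is only a sketch, not part of the original notes; by_subset, by_index, one_col and one_col_df are illustrative names. One thing to watch for: when a single column is selected, [ ] returns a plain vector unless drop = FALSE is given, whereas subset() on a data frame returns a data frame.

```r
# Rebuild the example data frame so this block runs on its own
d <- data.frame(x1 = 1:4, x2 = (1:4) / 10, x3 = 11:14,
                sex = rep(c('male', 'female'), 2))

# subset() with a row condition and a negative column range ...
by_subset <- subset(d, x2 < 0.3, select = -(x3:sex))

# ... and a bracket version that names the columns to keep
by_index <- d[d$x2 < 0.3, c('x1', 'x2')]
same_frame <- isTRUE(all.equal(by_subset, by_index))

# Selecting one column: vector by default, data frame with drop = FALSE
one_col <- d[d$x2 < 0.3, 'x2']
one_col_df <- d[d$x2 < 0.3, 'x2', drop = FALSE]
```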

> attach(subset(d, sex=='male' & x3==11, x1:x3))
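
attach() puts the columns of the subset (x1, x2 and x3 here) on the search path so they can be used by name. A possible alternative, sketched below on the assumption that the columns are only needed for one expression, is with(), which evaluates the expression inside the subset and leaves the search path alone. The names males11 and total are made up for the example.

```r
# Rebuild the example data frame so this block runs on its own
d <- data.frame(x1 = 1:4, x2 = (1:4) / 10, x3 = 11:14,
                sex = rep(c('male', 'female'), 2))

# Same subset as in the attach() call: one row, columns x1 through x3
males11 <- subset(d, sex == 'male' & x3 == 11, x1:x3)

# Use the columns by name without attaching anything
total <- with(males11, x1 + x2 + x3)
```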
