string - Subsetting in R, joining and calculating multiple repetitions




Here is a sample:

> tmp
   label value1 value2
1 aa_x_x     xx     xx
2 bc_x_x     xx     xx
3 aa_x_x     xx     xx
4 bc_x_x     xx     xx

How can I calculate the median of repeated labels (or, more generally, other statistics of the corresponding values in the other data frame columns), taking into account only the first 2 letters (i.e. "aa_1_1" and "aa_s_3" count as the same label)? The list of labels is finite and usable.

I have read about aggregate, %in%, subset, and substr, but I have been unable to put together something useful and simple.

Here is what I hope to get:

> tmp.result
  label median1 some.calculation2
1    aa      xx                xx
2    bc      xx                xx
3    aa      xx                xx
4    bc      xx                xx

Thank you very much.

Have you tried making a new data frame (I'll call it tmp2) in which tmp2$label is set to substr(tmp$label, 1, 2)? From there you can, for example, use tapply(tmp2$value1, tmp2$label, mean) to average the values of value1 aggregated on tmp2$label.
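The suggestion above can be sketched in base R roughly as follows. The numeric values in tmp are made up for illustration (the question only shows "xx" placeholders), and the column name mean2 is a hypothetical stand-in for "some.calculation2". This is one possible arrangement, not the only one.

```r
# Hypothetical sample data; the numeric values are invented for illustration.
tmp <- data.frame(
  label  = c("aa_1_1", "bc_1_1", "aa_s_3", "bc_2_2"),
  value1 = c(1, 10, 3, 20),
  value2 = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

# Copy the data and keep only the first two letters of each label.
tmp2 <- tmp
tmp2$label <- substr(tmp$label, 1, 2)

# Median of value1 per two-letter label (returns a named array).
med1 <- tapply(tmp2$value1, tmp2$label, median)

# One row per label with a different statistic for each column.
tmp.result <- merge(
  aggregate(value1 ~ label, data = tmp2, FUN = median),
  aggregate(value2 ~ label, data = tmp2, FUN = mean),
  by = "label"
)
names(tmp.result) <- c("label", "median1", "mean2")
```

Note that this gives one row per distinct label prefix rather than one row per original row.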

An alternative using dplyr:

    library(dplyr)
    # Group on the part of the label before the first underscore
    tmp %>%
      group_by(label = sub('_.*$', '', label)) %>%
      transmute(median1 = median(value1), mean1 = mean(value2))

Or data.table:

    library(data.table)
    # Add the grouped statistics by reference, then keep columns 1, 4 and 5
    setDT(tmp)[, c('median1', 'mean1') := list(median(value1), mean(value2)),
               by = .(label = sub('_.*$', '', label))][, c(1, 4:5), with = FALSE]
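For comparison, a row-preserving result (one output row per input row, as in the desired tmp.result) can also be sketched in base R with ave(), without extra packages. The sample values below are invented for illustration, and key is a hypothetical helper name.

```r
# Hypothetical sample data; the numeric values are invented for illustration.
tmp <- data.frame(
  label  = c("aa_1_1", "bc_1_1", "aa_s_3", "bc_2_2"),
  value1 = c(1, 10, 3, 20),
  value2 = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

# Group key: everything before the first underscore.
key <- sub("_.*$", "", tmp$label)

# ave() returns a vector as long as its input, where each element
# is replaced by the statistic computed over its group.
tmp.result <- data.frame(
  label   = key,
  median1 = ave(tmp$value1, key, FUN = median),
  mean1   = ave(tmp$value2, key, FUN = mean),
  stringsAsFactors = FALSE
)
```

Because ave() keeps the original row order and length, repeated labels appear once per original row, each carrying its group's statistic.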

