R - Group by and aggregate with dynamic column name matching

Is it possible to group_by using a regex match on column names with dplyr?

    library(dplyr) # dplyr_0.5.0; R version 3.3.2 (2016-10-31)

    # dummy data
    set.seed(1)
    df1 <- sample_n(iris, 20) %>%
      mutate(Sepal.Length = round(Sepal.Length),
             Sepal.Width = round(Sepal.Width))

Group by, static version (looks and works fine, but imagine having 10-20 columns):

    df1 %>%
      group_by(Sepal.Length, Sepal.Width) %>%
      summarise(mysum = sum(Petal.Length))

Group by, dynamic - "ugly" version:

    df1 %>%
      group_by_(.dots = colnames(df1)[grepl("^Sepal", colnames(df1))]) %>%
      summarise(mysum = sum(Petal.Length))

Ideally, something like this (doesn't work, because starts_with returns indices):

    df1 %>%
      group_by(starts_with("Sepal")) %>%
      summarise(mysum = sum(Petal.Length))

    Error in eval(expr, envir, enclos) :
      wrong result size (0), expected 20 or 1

Expected output:

    # Source: local data frame [6 x 3]
    # Groups: Sepal.Length [?]
    #
    #   Sepal.Length Sepal.Width mysum
    #          <dbl>       <dbl> <dbl>
    # 1            4           3   1.4
    # 2            5           3  10.9
    # 3            6           2   4.0
    # 4            6           3  43.7
    # 5            7           3  15.7
    # 6            8           4   6.4

Note: this sounds like a duplicate post; kindly link relevant posts if there are any.

~~This feature will be implemented in a future release~~, see GitHub issue #2619:

The solution is to use the group_by_at function:

    df1 %>%
      group_by_at(vars(starts_with("Sepal"))) %>%
      summarise(mysum = sum(Petal.Length))

Edit: implemented in dplyr_0.7.1.
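For readers on a newer dplyr, the same grouping can probably be written with across() inside group_by(). The sketch below assumes dplyr >= 1.0.0 (where across() and the .groups argument exist), which is not a version mentioned in the original post. The names res_prefix and res_regex are only illustrative. The second call uses matches(), which takes a regular expression and so is closer to the regex match asked about in the question.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, so across() is available

# same dummy data as in the question
set.seed(1)
df1 <- sample_n(iris, 20) %>%
  mutate(Sepal.Length = round(Sepal.Length),
         Sepal.Width  = round(Sepal.Width))

# group by every column whose name starts with "Sepal"
res_prefix <- df1 %>%
  group_by(across(starts_with("Sepal"))) %>%
  summarise(mysum = sum(Petal.Length), .groups = "drop")

# same grouping, with the columns picked by a regular expression
res_regex <- df1 %>%
  group_by(across(matches("^Sepal"))) %>%
  summarise(mysum = sum(Petal.Length), .groups = "drop")
```

Both calls should select Sepal.Length and Sepal.Width as grouping columns. The sampled rows can differ between R versions even with the same seed, so the sums need not reproduce the table in the question exactly.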
