- You can do this by combining [dplyr](http://cran.r-project.org/web/packages/dplyr/index.html), [tidyr](http://cran.r-project.org/web/packages/tidyr/index.html) and my [broom](https://github.com/dgrtwo/broom) package (you can install them with `install.packages`). First you need to gather all the numeric columns into a single column:
- library(dplyr)
- library(tidyr)
- tidied <- myDat %>%
- gather(column, value, -X, -Recipe, -Step, -Stage, -Prod)
- (This assumes that every column besides X, Recipe, Step, Stage, and Prod is numeric and should therefore be used as the response in its own regression. If that's not the case, you need to remove the other columns beforehand. You'll need to produce a reproducible example of the problem if you need a more customized solution.)
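If some columns are not numeric, one way to drop them before gathering is sketched below. The toy `myDat` and its `y1`, `y2`, and `note` columns are made up for illustration (they are not from the question), and the base-R column filter is just one option among several.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: `note` is a non-numeric column that should not be modelled
myDat <- data.frame(
  X = 1:4,
  Recipe = c("A", "A", "B", "B"),
  Step = "s1", Stage = "st1", Prod = "p1",
  y1 = c(1.0, 2.1, 2.9, 4.2),
  y2 = c(0.5, 0.4, 0.7, 0.6),
  note = c("ok", "ok", "check", "ok"),
  stringsAsFactors = FALSE
)

# Keep the predictor and grouping columns, plus every numeric column
id_cols <- c("X", "Recipe", "Step", "Stage", "Prod")
numeric_cols <- names(myDat)[sapply(myDat, is.numeric)]
keep <- union(id_cols, numeric_cols)

# Gather the remaining numeric columns into (column, value) pairs
tidied <- myDat[, keep] %>%
  gather(column, value, -X, -Recipe, -Step, -Stage, -Prod)
```

Here `tidied` has one row per original row and numeric column, with the column name stored in `column` and the measurement in `value`; `note` never reaches the regression step.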
- Then perform each regression, while grouping by the column and the four grouping variables.
- library(broom)
- regressions <- tidied %>%
- group_by(column, Recipe, Step, Stage, Prod) %>%
- do(mod = lm(value ~ X, data = .))
- glances <- regressions %>% glance(mod)
- The resulting `glances` data frame will have one row for each combination of column, Recipe, Step, Stage, and Prod, along with an `r.squared` column containing the R-squared from each model. (It will also contain `adj.r.squared`, along with an F-test p-value column `p.value`: see [here](https://github.com/dgrtwo/broom#tidying-functions) for more.) Running `coefs <- regressions %>% tidy(mod)` will probably also be useful for you, as it will get the coefficient estimates and p-values from each regression.
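As far as I know, more recent broom releases no longer support calling `glance(mod)` or `tidy(mod)` on the data frame returned by `do(mod = ...)`. A variant that should work across versions is to call `glance()` or `tidy()` on each fitted model inside `do()`. The sketch below is self-contained and uses a made-up `myDat` with two hypothetical response columns, `y1` and `y2`; swap in your own data.

```r
library(dplyr)
library(tidyr)
library(broom)

# Hypothetical toy data standing in for myDat: two recipes, two numeric responses
set.seed(1)
myDat <- data.frame(
  X      = rep(1:10, times = 2),
  Recipe = rep(c("A", "B"), each = 10),
  Step   = "s1",
  Stage  = "st1",
  Prod   = "p1",
  y1     = rnorm(20),
  y2     = rnorm(20)
)

# Gather the numeric response columns into (column, value) pairs
tidied <- myDat %>%
  gather(column, value, -X, -Recipe, -Step, -Stage, -Prod)

# One row of model-level statistics (r.squared, p.value, ...) per group
glances <- tidied %>%
  group_by(column, Recipe, Step, Stage, Prod) %>%
  do(glance(lm(value ~ X, data = .)))

# One row per coefficient (intercept and slope) per group
coefs <- tidied %>%
  group_by(column, Recipe, Step, Stage, Prod) %>%
  do(tidy(lm(value ~ X, data = .)))
```

With this toy data, `glances` has one row for each of the four column/Recipe combinations, and `coefs` has two rows (intercept and `X`) for each of them.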
- A similar use case is described in the ["broom and dplyr" vignette](http://cran.r-project.org/web/packages/broom/vignettes/broom_and_dplyr.html), and in Section 3.1 of [this paper](http://arxiv.org/abs/1412.3565).