Question:

Let's say that I have several R objects, e.g. lm outputs:

m1 <- lm(x ~ y, data = data, subset = sample==1)

m2 <- lm(x ~ y, data = data, subset = sample==2)

m3 <- lm(x ~ y, data = data, subset = sample==3)

m4 <- lm(x ~ y, data = data, subset = sample==4)

Now I want to average those objects, i.e. average all the estimates produced by lm. I would be very happy if I could get summary statistics for all the parameters in the objects, e.g. the average intercept. What simplifies the problem is that all the objects have roughly the same structure; they are just fitted on different samples.

Is there any way to do this in a general fashion, that is, using a single general function rather than taking all the individual values and averaging them one at a time? I would also need this kind of function for different kinds of objects.

Probably lapply could be used in some way, but how do I deal with multiple (varying) levels of nesting?

Answer:

This should work (example using the mtcars dataset):

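library(dplyr)
meanpars <- mtcars %>%
              group_by(cyl) %>%                      # one group per cylinder count
              do(mod = lm(mpg ~ wt, data = .)) %>%   # fit one model per group
              summarise(                             # one row of coefficients per model
               intercepts = coef(mod)[1],
               wtbeta     = coef(mod)[2]) %>%
              summarise(                             # average across the models
               meaninter = mean(intercepts),
               meanbeta = mean(wtbeta))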

Here it is with your toy data plugged in:

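library(dplyr)
meanpars <- data %>%
              group_by(sample) %>%                   # one group per sample
              do(mod = lm(x ~ y, data = .)) %>%      # fit one model per sample
              summarise(                             # one row of coefficients per model
               intercepts = coef(mod)[1],
               ybeta     = coef(mod)[2]) %>%
              summarise(                             # average across the models
               meaninter = mean(intercepts),
               meanbeta = mean(ybeta))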

Edit: If you don't want to average the coefficients at the end, just remove the last summarise() call and you'll still get a data.frame with one row of coefficients per model.
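
For the more general case the question asks about (one function instead of averaging values one at a time), a base R approach may also be worth a look. The following is only a minimal sketch: the helper name average_estimates and its extract argument are made up for illustration, mtcars stands in for the real data, and it assumes every model returns a numeric vector of the same length and order from the extractor.

```r
# Hypothetical helper (not part of any package): summarise estimates across models.
# `extract` is any function returning a named numeric vector per model, e.g. coef.
average_estimates <- function(models, extract = coef) {
  # Bind the per-model vectors as columns: estimates in rows, models in columns.
  est <- do.call(cbind, lapply(models, extract))
  data.frame(mean = rowMeans(est),
             sd   = apply(est, 1, sd))
}

# Fit one model per subset and keep the fits in a list.
models <- lapply(split(mtcars, mtcars$cyl),
                 function(d) lm(mpg ~ wt, data = d))

res <- average_estimates(models)
print(res)
```

Because coef() is a generic, the same helper should work for other model classes that provide a coef method. For quantities nested deeper in the object, you could pass your own extractor, e.g. function(m) summary(m)$coefficients[, "Std. Error"] for lm fits.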
