How does one express a linear model where observations can belong to multiple categories and the number of categories is large?

For example, using time dummies as the categories, here is a problem that is easy to set up since the number of categories (time periods) is small and known:

```r
tmp <- "day 1, day 2
0,1
1,0
1,1"
periods <- read.csv(text = tmp)
y <- rnorm(3)
print(lm(y ~ day.1 + day.2 + 0, data=periods))
```

Now suppose that instead of two days there were 100. Would I need to create a formula like the following?

``y ~ day.1 + day.2 + ... + day.100 + 0``

Presumably such a formula would have to be created programmatically. This seems inelegant and un-R-like.
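For reference, here is a minimal sketch of what that programmatic construction might look like, using base R's `reformulate()`. The column names `day.1`, ..., `day.100` and the random data are assumptions made up for illustration only.

```r
set.seed(1)

# Hypothetical column names day.1, ..., day.100 (same naming as the small example)
day.names <- paste0("day.", 1:100)

# reformulate() builds a formula from a character vector of term labels;
# intercept = FALSE plays the role of "+ 0"
f <- reformulate(day.names, response = "y", intercept = FALSE)

# Made-up data: 1000 observations with random 0/1 dummies and a random response
periods <- data.frame(1 * matrix(runif(1000 * 100) > 0.5, nrow = 1000))
names(periods) <- day.names
periods$y <- rnorm(1000)

fit <- lm(f, data = periods)
```

This works, but it still requires the full set of dummy columns to exist in the data frame first.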

What is the right R way to tackle this? For example, aside from the formula problem, is there a better way to create the dummies than creating a matrix of 1s and 0s (as I did above)? For the sake of concreteness, say that the actual data consists (for each observation) of a start and end date (so that `tmp` would contain a 1 in each column between start and end).
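To make the start/end setup concrete, the sketch below builds the dummy matrix directly from two vectors with `outer()`. The names `start`, `end`, and `n.periods`, and the use of integer period indices in place of real dates, are assumptions for illustration; actual dates would first have to be mapped to period indices.

```r
set.seed(1)

n <- 200          # hypothetical number of observations
n.periods <- 20   # hypothetical number of time periods

# Hypothetical data: each observation spans the periods start[i]..end[i]
start <- sample(seq_len(n.periods), n, replace = TRUE)
end   <- pmin(start + sample(0:3, n, replace = TRUE), n.periods)
y     <- rnorm(n)

days <- seq_len(n.periods)

# dummies[i, j] is 1 when period j lies within [start[i], end[i]], else 0
dummies <- 1 * (outer(start, days, "<=") & outer(end, days, ">="))
colnames(dummies) <- paste0("day.", days)

# A matrix can appear directly on the right-hand side of a formula,
# so no per-column formula is needed; "+ 0" drops the intercept
fit <- lm(y ~ dummies + 0)
```

Passing the whole matrix as a single term sidesteps the long formula, though the coefficient names are then prefixed with the matrix name (for example `dummiesday.1`).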

## Update:

Based on the answer of @jlhoward, here is a larger example:

```r
num.observations <- 1000
# Manually create 100 columns of dummies called X1, ..., X100
periods <- data.frame(1*matrix(runif(num.observations*100) > 0.5, nrow = num.observations))
y <- rnorm(num.observations)
print(summary(lm(y ~ ., data = periods)))
```

It illustrates the manual creation of a data frame of dummies (1s and 0s). I would be interested in learning whether there is a more R-like way of dealing with this "multiple dummies per observation" issue.

You can use the `.` notation to include all variables other than the response in a formula, and `-1` to remove the intercept. Also, put everything in your data frame; don't make `y` a separate vector.

``````set.seed(1)    # for reproducibility