Problem description:

I have a data set with the following columns:

x1 x2 x3 x4 x5 y

All of them hold integer or float values, and the y values range from 98,000 to 110,000.

If I want to find the relationship between x1 and y, x2 and y, ..., x5 and y, and come up with

y = A.x1+c

how should I do it?

I tried plotting graphs and also tried the lm() and fit() functions in R.

```
fit <- lm(Y~X1+X2+X3+X4+X5,data=data)
step <- stepAIC(fit, direction="both")
```

Kindly help.

I think you should use a specialized package that finds the best linear relation between `y` and the variables `xi`. See, for example, the `leaps` package.
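As a rough sketch of what that could look like, assuming the `leaps` package is installed and using simulated data (the data frame `dat` and the choice of BIC as the criterion are illustrative assumptions, not part of the question), `regsubsets()` searches over subsets of predictors and reports the best model of each size:

```r
# Assumes the leaps package is installed: install.packages("leaps")
library(leaps)

set.seed(1)  # simulated data, purely illustrative
dat <- as.data.frame(matrix(rnorm(6 * 50), ncol = 6))
colnames(dat) <- c(paste0("x", 1:5), "y")

# Search over predictor subsets of size 1 to 5
subsets <- regsubsets(y ~ ., data = dat, nvmax = 5)
s <- summary(subsets)

s$which  # logical matrix: which predictors enter the best model of each size
s$bic    # BIC of the best model of each size (lower is better)

# Coefficients of the subset with the lowest BIC
best_size <- which.min(s$bic)
coef(subsets, best_size)
```

Other criteria stored in the summary, such as `s$adjr2` or `s$cp`, could be used in place of BIC.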

You can also find the relation by looping over all your xi. Here is one way to do it. First I wrap your code in a function, using the `dot formula` notation.

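```
library(MASS)  # provides stepAIC()

# Fit y against the single predictor column `var`, then run stepwise AIC selection
lm_col <- function(var, data) {
  fit <- lm(y ~ ., subset(data, select = c('y', var)))
  stepAIC(fit, direction = "both")
}
```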

Then you loop over all your variables using `lapply`:

```
lapply(paste0('x',seq(5)),lm_col,data=dat)
```

You can test this using this data:

```
dat <- as.data.frame(matrix(rnorm(6*10),ncol=6))
colnames(dat) <- c(paste0('x',seq(5)),'y')
```
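If the goal is only to read off `A` and `c` in `y = A.xi + c` for each predictor, a simpler sketch is to fit one plain `lm()` per column and collect the coefficients. Because each of these models has a single predictor, stepwise selection can only keep or drop it, so it is left out here. The names `predictors` and `coefs` and the fixed seed are illustrative choices:

```r
set.seed(42)  # only so the simulated example is reproducible
dat <- as.data.frame(matrix(rnorm(6 * 10), ncol = 6))
colnames(dat) <- c(paste0("x", seq(5)), "y")

predictors <- paste0("x", seq(5))

# One simple regression y ~ xi per predictor; keep intercept c and slope A
coefs <- t(sapply(predictors, function(v) {
  fit <- lm(reformulate(v, response = "y"), data = dat)
  setNames(coef(fit), c("c", "A"))
}))

coefs  # one row per predictor, columns c (intercept) and A (slope)
```

Calling `summary()` on each fit would additionally give the standard errors and R-squared, which help judge how strong each relationship is.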

But as I said at the beginning, I don't think this is the best way, statistically speaking, to do what you want (which is not very clear).