I have a data set with the following columns:

x1 x2 x3 x4 x5 y

All of them hold integer or float values, and the y values range from 98,000 to 110,000.

I want to find the relationship between each predictor and y (x1 and y, x2 and y, ..., x5 and y) and come up with an equation of the form

y = A.x1+c

How should I do it?

I tried plotting graphs, and I also tried the lm() and fit() functions in R:

``fit <- lm(Y~X1+X2+X3+X4+X5, data=data); step <- stepAIC(fit, direction="both")``

Kindly help.

I think you should use a specialised package that finds the best linear relationship between `y` and the variables `xi`. See, for example, the `leaps` package.
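As a rough illustration of the `leaps` route, here is a minimal sketch using `regsubsets()`, which searches for the best subset of predictors at each model size. It assumes the `leaps` package is installed; the random data, the seed, and the choice of adjusted R-squared as the selection criterion are my own assumptions, not something taken from the question.

```r
library(leaps)  # assumes install.packages("leaps") has been run

# Made-up data for illustration only: 10 rows, columns x1..x5 and y
set.seed(1)
dat <- as.data.frame(matrix(rnorm(6 * 10), ncol = 6))
colnames(dat) <- c(paste0("x", seq(5)), "y")

# Best-subset search over all five predictors
subsets <- regsubsets(y ~ ., data = dat, nvmax = 5)
res <- summary(subsets)

# Logical matrix: which predictors enter the best model of each size
print(res$which)

# Pick the model size with the highest adjusted R-squared (one possible criterion)
best_size <- which.max(res$adjr2)
print(coef(subsets, best_size))
```

Other criteria such as `res$bic` or `res$cp` are also returned by `summary()`, so you can swap the selection rule if adjusted R-squared is not what you want.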

You can also find the relationship by looping over all your `xi`. Here is one way to do it. First, I wrap your code in a function, using the `.` formula notation, where `y ~ .` means y regressed on every other column of the data.

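``````library(MASS)  # provides stepAIC()

# Fit y against a single column `var` of `data`, then run stepwise AIC selection
lm_col <- function(var, data) {
  # The subset keeps only y and `var`, so y ~ . is the same as y ~ var
  fit <- lm(y ~ ., data = subset(data, select = c('y', var)))
  stepAIC(fit, direction = "both")
}
``````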

Then loop over all your variables using `lapply`:

``````lapply(paste0('x', seq(5)), lm_col, data = dat)
``````

You can test this using this data:

``````dat <- as.data.frame(matrix(rnorm(6*10),ncol=6))
colnames(dat) <- c(paste0('x',seq(5)),'y')
``````
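If what you are after is just the slope `A` and intercept `c` of `y = A.xi + c` for each variable, a plain `lm()` per column is enough. The sketch below uses base R only and collects the coefficients in one table; the seed and the names `predictors`, `coef_table` and `r_squared` are my own choices for illustration.

```r
# Made-up data for illustration only: 10 rows, columns x1..x5 and y
set.seed(1)
dat <- as.data.frame(matrix(rnorm(6 * 10), ncol = 6))
colnames(dat) <- c(paste0("x", seq(5)), "y")

predictors <- paste0("x", seq(5))

# One simple regression y ~ xi per predictor, one table row per fit
coef_table <- do.call(rbind, lapply(predictors, function(v) {
  fit <- lm(reformulate(v, response = "y"), data = dat)
  data.frame(
    variable  = v,
    c         = unname(coef(fit)[1]),   # intercept
    A         = unname(coef(fit)[2]),   # slope for this predictor
    r_squared = summary(fit)$r.squared  # share of variance in y explained
  )
}))

print(coef_table)
```

Each row describes one separate single-variable fit, so the slopes will generally differ from the coefficients of the joint model `lm(y ~ x1 + x2 + x3 + x4 + x5)`.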

But as I said at the beginning, I don't think this is the best way, statistically speaking, to do what you want (which is not very clear).
