Problem description:

I am trying to fit an XGBoost model in R with caret, but when I run it, I get this error:

"At least one of the class levels is not a valid R variable name; This will cause errors when class probabilities are generated because the variables names will be converted to X0, X1 . Please use factor levels that can be used as valid R variable names".

Here is my code:

library(caret)

# Candidate hyperparameter values for the grid search
xgb_grid = expand.grid(
  nrounds = 100,
  eta = c(0.3, 0.1, 0.01, 0.001, 0.0001),
  gamma = c(0, 1),
  max_depth = c(2, 4, 6, 8, 10),
  min_child_weight = (0:200) * 0.1,
  subsample = (0:100) * 0.01,
  colsample_bytree = (0:100) * 0.01
)

# Resampling and performance-metric settings
xgb_trcontrol = trainControl(
  method = "repeatedcv",
  number = 5,
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all",
  classProbs = TRUE,
  summaryFunction = twoClassSummary,
  allowParallel = TRUE
)

# Fit the model with caret::train()
xgb_train <- caret::train(
  x = as.matrix(train_n[, predictorsNames]),
  y = as.factor(train_n[, outcomeName]),
  trControl = xgb_trcontrol,
  tuneGrid = xgb_grid,  # the argument is spelled tuneGrid, not tunegrid
  method = "xgbTree",
  metric = "ROC"
)

From what I have read, the error has to do with the levels of the y factor. On the other hand, I recall that XGBoost requires all variables to be numeric, including y. Or am I wrong? How can I fix it?
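
As a minimal sketch of what the message is pointing at, the base-R snippet below uses a hypothetical 0/1 vector `y_raw` in place of `train_n[, outcomeName]` (an assumption, since the real data is not shown). It illustrates why levels such as "0" and "1" are rejected when `classProbs = TRUE`, and one possible way to relabel them. The labels "no" and "yes" are arbitrary placeholders.

```r
# Hypothetical 0/1 outcome standing in for train_n[, outcomeName]
y_raw <- c(0, 1, 1, 0, 1)

# as.factor() on a 0/1 vector yields the levels "0" and "1"
y <- as.factor(y_raw)
levels(y)                # "0" "1": not syntactically valid R variable names
make.names(levels(y))    # "X0" "X1": the names mentioned in the error message

# Relabel with explicit, valid names (the labels are arbitrary placeholders)
y_fixed <- factor(y_raw, levels = c(0, 1), labels = c("no", "yes"))
levels(y_fixed)

# A level is valid if make.names() leaves it unchanged
all(levels(y_fixed) == make.names(levels(y_fixed)))
```

Passing a factor like `y_fixed` as `y` should avoid the message. As far as I know, caret takes care of turning the factor into the numeric labels XGBoost expects when `method = "xgbTree"`, so `y` can stay a factor on the caret side. Note that `twoClassSummary` treats the first factor level as the event class, so the order of `labels` matters for how sensitivity and specificity are reported.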
