Model selection in R: the Accuracy metric and the largest-value rule

For model selection we usually build models with the train function from the caret package and then evaluate them.

Method 1:

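library(caret)

set.seed(1234)
# 5-fold cross-validation; the selection rule is left at its default
tr_control <- trainControl(method = 'cv', number = 5)
# Fit a random forest model
model_rf <- train(Class ~ ., data = traindata,
                  trControl = tr_control, method = 'rf')
model_rf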

Output:

mtry  Accuracy   Kappa
 2    0.9276465  0.8552977
16    0.9314521  0.8628921
30    0.9276627  0.8553120

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 16.

Method 2:

set.seed(1234)
model_rf <- train(Class ~., data = traindata, method = 'rf', 
                  trControl = trainControl(method = 'cv', 
                                           number = 5, 
                                           selectionFunction = 'oneSE'))
model_rf

mtry  Accuracy   Kappa
 2    0.9276143  0.8552365
16    0.9212771  0.8425685
30    0.9250988  0.8502003

Accuracy was used to select the optimal model using the one SE rule.
The final value used for the model was mtry = 2.

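The two runs end up with different final models, and the rule used to pick the model also differs. In both cases the metric is Accuracy; what changes is how Accuracy is used: Method 1 takes the candidate with the largest value, while Method 2 applies the one SE rule. Note that in the Method 2 output mtry = 2 also happens to have the highest Accuracy, so the resampled values themselves differ between the two runs as well.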

The selection rule differs because Method 2 sets selectionFunction = 'oneSE' in trainControl.
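To make the difference between the two rules concrete, here is a minimal base R sketch of both. It is not caret's code: pick_best and pick_one_se are hypothetical helper names, and the logic only follows my understanding of the one SE rule (take the simplest candidate whose metric is within one standard error of the best). The Accuracy values are copied from the Method 1 output, but the AccuracySD column is made up for illustration, since the printed output above does not show it. Rows are assumed to be ordered simplest first, taken here to mean the smallest mtry.

```r
# Accuracy values come from the Method 1 output above.
# AccuracySD values are ASSUMED for illustration; they are not from the post.
results <- data.frame(
  mtry       = c(2, 16, 30),
  Accuracy   = c(0.9276465, 0.9314521, 0.9276627),
  AccuracySD = c(0.010, 0.012, 0.011)
)

# Largest-value rule: the row with the highest metric wins.
pick_best <- function(res, metric = "Accuracy") {
  which.max(res[[metric]])
}

# One SE rule (sketch): find the best row, subtract one standard error
# (SD / sqrt(number of resamples)), then take the first row that still
# reaches that threshold. Rows are assumed sorted simplest first.
pick_one_se <- function(res, num, metric = "Accuracy") {
  best_row  <- which.max(res[[metric]])
  se        <- res[[paste0(metric, "SD")]][best_row] / sqrt(num)
  threshold <- res[[metric]][best_row] - se
  min(which(res[[metric]] >= threshold))
}

best_idx   <- pick_best(results)
one_se_idx <- pick_one_se(results, num = 5)  # 5 folds, as in the post

cat("largest value picks mtry =", results$mtry[best_idx], "\n")
cat("one SE rule picks mtry   =", results$mtry[one_se_idx], "\n")
```

With these assumed standard deviations the largest-value rule keeps mtry = 16, while the one SE rule falls back to mtry = 2, because its Accuracy lies within one standard error of the best. With a much smaller standard deviation both rules would agree.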

Original post: https://www.cnblogs.com/Grayling/p/11235441.html