R - Accuracy from AdaBoost in caret not reproducible
I am using the caret package to train an AdaBoost algorithm.
This is the training output:
    > adaboostfit1
    Bagged AdaBoost

    507 samples
     49 predictor
      4 classes: 'a', 'b', 'c', 'd'

    No pre-processing
    Resampling: Cross-Validated (10 fold, repeated 10 times)
    Summary of sample sizes: 455, 456, 458, 458, 457, 456, ...
    Resampling results across tuning parameters:

      maxdepth  mfinal  Accuracy   Kappa
       1         50     0.7252999  0.5148544
       1        100     0.7249040  0.5143589
       1        150     0.7245041  0.5138308
       1        200     0.7248961  0.5144177
       3         50     0.8449199  0.7246524
       3        100     0.8457983  0.7262998
       3        150     0.8455672  0.7258904
       3        200     0.8459633  0.7266197
       5         50     0.8927122  0.8134603
       5        100     0.8939299  0.8157804
       5        150     0.8941498  0.8162610
       5        200     0.8949189  0.8177514
       7         50     0.8967421  0.8208284
       7        100     0.8965234  0.8205643
       7        150     0.8953196  0.8184594
       7        200     0.8953154  0.8183581
       9         50     0.8966646  0.8210096
       9        100     0.8974849  0.8225458
       9        150     0.8967082  0.8211391
       9        200     0.8970887  0.8217614
      11         50     0.8959179  0.8196260
      11        100     0.8952882  0.8183883
      11        150     0.8962304  0.8200345
      11        200     0.8962304  0.8199575

    Kappa was used to select the optimal model using the largest value.
    The final values used for the model were mfinal = 100 and maxdepth = 9.

The best model therefore has a Kappa of 0.8225458 and an accuracy of 0.8974849. I extract the predictions using:
    adaboostpre2 <- predict(adaboostfit1)

and the confusion matrix on the training data shows a perfect fit, when I was expecting an accuracy of 0.8975!
    Confusion Matrix and Statistics

              Reference
    Prediction   a   b   c   d
             a 265   0   0   0
             b   0 186   0   0
             c   0   0  11   0
             d   0   0   0  45

    Overall Statistics

                   Accuracy : 1
                     95% CI : (0.9928, 1)
        No Information Rate : 0.5227
        P-Value [Acc > NIR] : < 2.2e-16

                      Kappa : 1
     Mcnemar's Test P-Value : NA

    Statistics by Class:

                         Class: a Class: b Class: c Class: d
    Sensitivity            1.0000   1.0000   1.0000  1.00000
    Specificity            1.0000   1.0000   1.0000  1.00000
    Pos Pred Value         1.0000   1.0000   1.0000  1.00000
    Neg Pred Value         1.0000   1.0000   1.0000  1.00000
    Prevalence             0.5227   0.3669   0.0217  0.08876
    Detection Rate         0.5227   0.3669   0.0217  0.08876
    Detection Prevalence   0.5227   0.3669   0.0217  0.08876
    Balanced Accuracy      1.0000   1.0000   1.0000  1.00000

Does anyone have an explanation for why this might be the case? Is it related to the voting amongst the various learners? Trying other algorithms, such as C5.0, similarly gives a perfect fit. Ideally I would like to access the predictions that gave the accuracy of 0.8975. Thanks.
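A likely explanation, offered as a sketch rather than a definitive answer: the accuracy of 0.8975 is averaged over the held-out folds of the repeated cross-validation, whereas predict() with no new data returns predictions for the training rows from the final model, which was refit on all 507 samples. Those rows were seen during fitting, so a flexible ensemble can reproduce them perfectly. The held-out predictions are only kept if savePredictions is set in trainControl, after which they are available in the pred element of the fitted object. The example below is an assumption-laden stand-in: it uses iris and a single rpart tree instead of the original data and the bagged AdaBoost model, and only 2 repeats to keep it quick.

```r
library(caret)

set.seed(1)

# Stand-in data and model (assumptions): iris and method = "rpart"
# replace the poster's 507-sample data set and bagged AdaBoost.
ctrl <- trainControl(
  method = "repeatedcv",
  number = 10,
  repeats = 2,                 # the original post used 10 repeats
  savePredictions = "final"    # keep held-out predictions for the best tuning values
)

fit <- train(
  x = iris[, 1:4],
  y = iris$Species,
  method = "rpart",
  trControl = ctrl,
  tuneLength = 3
)

# Resubstitution: the final model predicts the same rows it was trained on.
train_pred <- predict(fit)
resub_acc <- mean(train_pred == iris$Species)

# Held-out predictions: each row is predicted by a model that did not see it.
cv_pred <- fit$pred
cv_acc_by_fold <- tapply(cv_pred$pred == cv_pred$obs, cv_pred$Resample, mean)
cv_acc <- mean(cv_acc_by_fold)

# cv_acc matches the accuracy caret reports; resub_acc is usually higher.
print(c(resubstitution = resub_acc, cross_validated = cv_acc))
print(getTrainPerf(fit))
```

Applied to the setup in the question, this would mean adding savePredictions = "final" to the trainControl call, refitting, and reading adaboostfit1$pred; the columns pred, obs, rowIndex and Resample identify each held-out prediction.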