bayesian - Weka machine learning - Interpreting naive Bayes


I have a training dataset of ill horses. The data covers surgeries and diseases, with register fields such as the horse's temperature, age, pulse, respiratory rate, etc.

What I want is a classifier on the lived/died/euthanized column of each row. What I have been asked to check is:

  • Think about the hypothesis of independence of the variables
  • Check whether I have a large enough number of elements to obtain reliable probabilities
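One rough way to probe the independence hypothesis is to measure, within each outcome class, how strongly two nominal attributes are associated. The sketch below does this with Cramér's V in plain Python. The function names and the toy rows are hypothetical and only illustrate the idea; they are not taken from the horse-colic data. Values near 0 are compatible with independence given the class, values near 1 are not.

```python
from collections import Counter
from math import sqrt


def cramers_v(pairs):
    """Cramér's V for a list of (a, b) nominal pairs: 0 = no association, 1 = perfect."""
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    if len(row) < 2 or len(col) < 2:
        return 0.0  # a constant attribute cannot show any association
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (min(len(row), len(col)) - 1)))


def cramers_v_by_class(rows, attr_a, attr_b, class_attr):
    """Compute Cramér's V between two attributes separately inside each class."""
    by_class = {}
    for r in rows:
        by_class.setdefault(r[class_attr], []).append((r[attr_a], r[attr_b]))
    return {c: cramers_v(p) for c, p in by_class.items()}


# Hypothetical toy rows: perfectly associated for "lived", independent for "died".
toy_rows = (
    [{"pain": "no-pain", "peripulse": "normal", "outc": "lived"}] * 5
    + [{"pain": "depressed", "peripulse": "reduced", "outc": "lived"}] * 5
    + [
        {"pain": p, "peripulse": q, "outc": "died"}
        for p in ("no-pain", "depressed")
        for q in ("normal", "reduced")
    ]
)

result = cramers_v_by_class(toy_rows, "pain", "peripulse", "outc")
print(result)
```

On real data, sparse cells make the statistic noisy, so it is probably best read as a screening tool rather than a formal test.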

The dataset had 25% missing values, and I imputed them using MIMMI imputation.

Thinking about the possibility of getting reliable probabilities, I can see that the training dataset is a little unbalanced: 179 horses lived and 121 died (dead + euthanized). I'm not sure about that. Help with these 2 questions would be useful to me.
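A back-of-the-envelope way to think about "reliable probabilities" is the standard error of a proportion estimated from n cases. The sketch below uses only the two class sizes quoted above; the helper name `proportion_standard_error` is made up for this illustration, and p = 0.5 is taken as the worst case.

```python
from math import sqrt


def proportion_standard_error(p, n):
    """Standard error of a proportion p estimated from n independent observations."""
    return sqrt(p * (1 - p) / n)


# Class sizes quoted in the post: 179 lived, 121 died or were euthanized.
class_counts = {"lived": 179, "died or euthanized": 121}
total = sum(class_counts.values())

for name, n in class_counts.items():
    share = n / total
    worst_case = proportion_standard_error(0.5, n)
    print(f"{name}: share={share:.3f}, worst-case SE of a conditional probability={worst_case:.3f}")
```

The smaller the class, the wider the uncertainty on every conditional probability estimated inside it, so splitting the 121 non-survivors into two classes makes each estimate less stable.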

=== Run information ===

Scheme:       weka.classifiers.bayes.NaiveBayes
Relation:     horsecolic-weka.filters.unsupervised.attribute.Remove-R25-27
Instances:    300
Attributes:   24
              surgery
              age
              id
              temp
              pulse
              resprate
              tempextrem
              peripulse
              mucmemb
              capreft
              pain
              peri
              abddist
              ngtube
              ngreflux
              ngrph
              feces
              abd
              pcellvol
              totprot
              abdcentapp
              abdcenttotprot
              outc
              surgles
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

Naive Bayes Classifier

                                 Class
Attribute                        lived         died   euthanized
                                (0.59)       (0.26)       (0.15)
=================================================================
surgery
  yes                             97.0         59.0         28.0
  no                              84.0         20.0         18.0
  [total]                        181.0         79.0         46.0

age
  adult                          168.0         67.0         44.0
  young                           13.0         12.0          2.0
  [total]                        181.0         79.0         46.0

id
  mean                    1009274.0202 1452556.3598  751596.8611
  std. dev.               1431022.1677 1887025.7703  989556.6807
  weight sum                       179           77           44
  precision                  16915.735    16915.735    16915.735

temp
  mean                         34.8733      35.0055       33.054
  std. dev.                    10.2335      13.0545      14.9588
  weight sum                       179           77           44
  precision                     0.9275       0.9275       0.9275

pulse
  mean                         29.2039      33.2115      29.0187
  std. dev.                    10.8578      14.6404      16.7248
  weight sum                       179           77           44
  precision                     0.9107       0.9107       0.9107

resprate
  mean                         15.0771      16.9169      15.9348
  std. dev.                     8.9803       7.0278       8.1221
  weight sum                       179           77           44
  precision                     0.8667       0.8667       0.8667

tempextrem
  normal                          82.0         16.0         12.0
  warm                            36.0          7.0          3.0
  cool                            53.0         48.0         25.0
  cold                            12.0         10.0          8.0
  [total]                        183.0         81.0         48.0

peripulse
  normal                         133.0         22.0         11.0
  increased                        5.0          8.0          7.0
  reduced                         43.0         47.0         25.0
  absent                           2.0          4.0          5.0
  [total]                        183.0         81.0         48.0

mucmemb
  normal-pink                     95.0          9.0          7.0
  bright-pink                     23.0         13.0          6.0
  pale-pink                       37.0         19.0         12.0
  pale-cyanotic                   16.0         17.0         12.0
  bright-red                       7.0         14.0          8.0
  dark-cyanotic                    7.0         11.0          5.0
  [total]                        185.0         83.0         50.0

capreft
  short                          153.0         46.0         23.0
  long                            28.0         33.0         23.0
  long2                            1.0          1.0          1.0
  [total]                        182.0         80.0         47.0

pain
  no-pain                         53.0          6.0          8.0
  depressed                       42.0         21.0         14.0
  inte-mild-pain                  64.0         10.0          8.0
  inte-severe-pain                12.0         18.0         12.0
  cont-severe-pain                13.0         27.0          7.0
  [total]                        184.0         82.0         49.0

peri
  hypermotile                     42.0          7.0          7.0
  normal                          22.0          8.0          5.0
  hypomotile                      90.0         37.0         17.0
  absent                          29.0         29.0         19.0
  [total]                        183.0         81.0         48.0

abddist
  none                            88.0         17.0         13.0
  slight                          53.0         18.0          8.0
  moderate                        28.0         30.0         14.0
  severe                          14.0         16.0         13.0
  [total]                        183.0         81.0         48.0

ngtube
  none                            79.0         40.0         27.0
  slight                          90.0         32.0         15.0
  significant                     13.0          8.0          5.0
  [total]                        182.0         80.0         47.0

ngreflux
  none                           149.0         50.0         30.0
                                  17.0         15.0          6.0
  less                            16.0         15.0         11.0
  [total]                        182.0         80.0         47.0

ngrph
  mean                         11.3797      13.0882       8.0606
  std. dev.                     2.3535       3.2916       5.1673
  weight sum                       179           77           44
  precision                     0.7917       0.7917       0.7917

feces
  normal                          77.0         14.0         10.0
  increased                       16.0         14.0          8.0
  decreased                       44.0         15.0         11.0
  absent                          46.0         38.0         19.0
  [total]                        183.0         81.0         48.0

abd
  normal                          48.0         13.0          4.0
  other                           39.0          5.0          7.0
  firm-large-intestine            18.0          8.0          6.0
  dist-small-intest               32.0         24.0          8.0
  distended-large-intest          47.0         32.0         24.0
  [total]                        184.0         82.0         49.0

pcellvol
  mean                         31.0162      47.0465      46.0112
  std. dev.                    14.1207      18.5468       17.672
  weight sum                       179           77           44
  precision                     0.9518       0.9518       0.9518

totprot
  mean                         42.6539       41.451      43.7936
  std. dev.                    16.9138      18.6362      19.3247
  weight sum                       179           77           44
  precision                     0.9432       0.9432       0.9432

abdcentapp
  clear                          112.0         25.0         10.0
  cloudy                          54.0         22.0         20.0
  serosanguinous                  16.0         33.0         17.0
  [total]                        182.0         80.0         47.0

abdcenttotprot
  mean                         16.1341      21.1634      14.3203
  std. dev.                     6.8038       4.9109       8.6619
  weight sum                       179           77           44
  precision                     0.8837       0.8837       0.8837

surgles
  yes                             94.0         70.0         30.0
  no                              87.0          9.0         16.0
  [total]                        181.0         79.0         46.0

Time taken to build model: 0.01 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         216               72      %
Incorrectly Classified Instances        84               28      %
Kappa statistic                          0.5134
Mean absolute error                      0.1965
Root mean squared error                  0.3803
Relative absolute error                 52.8451 %
Root relative squared error             88.2672 %
Total Number of Instances              300

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.777     0.198      0.853     0.777     0.813      0.873    lived
                 0.675     0.175      0.571     0.675     0.619      0.871    died
                 0.568     0.082      0.543     0.568     0.556      0.824    euthanized
Weighted Avg.    0.72      0.175      0.735     0.72      0.725      0.865

=== Confusion Matrix ===

   a   b   c   <-- classified as
 139  28  12 |   a = lived
  16  52   9 |   b = died
   8  11  25 |   c = euthanized
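To see where the summary figures come from, the accuracy, kappa statistic and per-class precision and recall can be recomputed from the confusion matrix alone. This is a minimal sketch in plain Python; the function name `summarize` is invented for this illustration, and the matrix is copied from the output above (rows are actual classes, columns are predicted classes).

```python
# Confusion matrix from the Weka output: rows = actual, columns = predicted.
labels = ["lived", "died", "euthanized"]
matrix = [
    [139, 28, 12],
    [16, 52, 9],
    [8, 11, 25],
]


def summarize(matrix):
    """Return (accuracy, kappa, per-class precision, per-class recall)."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    accuracy = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / n**2
    kappa = (accuracy - expected) / (1 - expected)
    precision = [matrix[i][i] / col_sums[i] for i in range(k)]
    recall = [matrix[i][i] / row_sums[i] for i in range(k)]
    return accuracy, kappa, precision, recall


accuracy, kappa, precision, recall = summarize(matrix)
print(f"accuracy={accuracy:.3f} kappa={kappa:.4f}")
for name, p, r in zip(labels, precision, recall):
    print(f"{name}: precision={p:.3f} recall={r:.3f}")
```

For comparison, always predicting "lived" would be right for 179 of the 300 horses (about 60%), so the 72% accuracy is an improvement over that baseline, though a modest one.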

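Naive Bayes makes the prominent assumption that all attributes are independent of each other given the class. In your case that means age, surgery, temp and the rest are taken to be mutually independent. That may not be the case, though, and in many instances it is not. Naive Bayes can obtain decent results with little training data, but it is not a model whose assumptions are strictly correct. Finding a model with more realistic assumptions takes time and effort, though, and a naive Bayes model may already reach adequate accuracy. I'm not sure about the sample size; you'll have to look at the statistical power of your dataset.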
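The independence assumption is easiest to see in how the model combines evidence: it simply multiplies one per-attribute probability after another. The sketch below does this for a horse with surgery = yes and age = adult, using the counts printed in the Weka model. Two things are assumptions on my part: that the printed counts include add-one (Laplace) smoothing, which would explain why the surgery totals are 181/79/46 rather than the weight sums 179/77/44, and that the class priors are smoothed the same way, which matches the rounded 0.59/0.26/0.15. The variable names are made up for this illustration.

```python
classes = ["lived", "died", "euthanized"]

# Class sizes ("weight sum" in the Weka output).
weight_sum = {"lived": 179, "died": 77, "euthanized": 44}

# Assumed add-one smoothing for the priors: (count + 1) / (N + number of classes).
n_total = sum(weight_sum.values())
prior = {c: (weight_sum[c] + 1) / (n_total + len(classes)) for c in classes}

# (count, total) pairs copied from the printed model for the observed values.
evidence = {
    "surgery=yes": {"lived": (97.0, 181.0), "died": (59.0, 79.0), "euthanized": (28.0, 46.0)},
    "age=adult": {"lived": (168.0, 181.0), "died": (67.0, 79.0), "euthanized": (44.0, 46.0)},
}

# Naive Bayes: prior times the product of per-attribute conditional probabilities.
score = {}
for c in classes:
    s = prior[c]
    for table in evidence.values():
        count, total = table[c]
        s *= count / total
    score[c] = s

# Normalize so the three scores sum to 1.
norm = sum(score.values())
posterior = {c: score[c] / norm for c in classes}

for c in classes:
    print(f"{c}: prior={prior[c]:.2f} posterior={posterior[c]:.3f}")
```

If two attributes carry overlapping information, the product counts that evidence twice, which tends to push the posteriors toward 0 or 1 even when the predicted class is still right.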

