Hard Voting Meta Classifier for Disease Diagnosis using Mean Decrease in Impurity for Tree Models
DOI: https://doi.org/10.18488/76.v9i2.3037
Abstract
Machine learning techniques are increasingly used in medical science to predict and detect various diseases. This study proposes a hard voting ensemble that combines a bagging meta-estimator with a feed-forward neural network, using mean decrease in impurity (MDI) feature selection, to classify disease datasets. The work was carried out in Jupyter Notebook using Python 3. Two chronic disease datasets, the Indian Liver Patient Dataset (ILPD) and the PIMA Indians diabetes dataset, were used to build and test the proposed model. Each dataset was split into training and testing data in a 70:30 ratio. The experimental results show that the proposed voting ensemble performs better than its individual base learners. We also compared the accuracy of the model before and after feature reduction, and found that accuracy increased once unimportant features were removed. With the proposed ensemble model, the average MSE, bias and variance were 0.311, 0.217 and 0.094 respectively for the ILPD dataset, and 0.233, 0.186 and 0.047 respectively for the PIMA dataset. All three measures are lower for the ensemble classifier than for its individual constituent classifiers.
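The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, uses a synthetic dataset from `make_classification` in place of ILPD or PIMA, takes MDI importances from a random forest, and keeps features whose importance is at or above the mean. The hyperparameters (`n_estimators`, `hidden_layer_sizes`, `max_iter`, the selection threshold) are illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic binary data standing in for ILPD / PIMA.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, n_redundant=2, random_state=0
)

# 70:30 train/test split, as stated in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# MDI feature selection: a tree ensemble's feature_importances_ are
# impurity-based (mean decrease in impurity).
# ASSUMPTION: random forest as the tree model, mean importance as the cut-off.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0), threshold="mean"
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Base learner 1: bagging meta-estimator (decision trees by default).
bagging = BaggingClassifier(n_estimators=50, random_state=0)

# Base learner 2: feed-forward neural network, with scaling since MLPs
# are sensitive to feature ranges.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)

# Hard voting: each base learner casts one vote per sample.
ensemble = VotingClassifier(
    estimators=[("bagging", bagging), ("mlp", mlp)], voting="hard"
)
ensemble.fit(X_train_sel, y_train)

accuracy = accuracy_score(y_test, ensemble.predict(X_test_sel))
print(f"features kept: {X_train_sel.shape[1]} of {X.shape[1]}")
print(f"ensemble test accuracy: {accuracy:.3f}")
```

With only two voters, hard voting can tie; scikit-learn's `VotingClassifier` then picks the class that comes first in ascending label order. Fitting each base learner separately on `X_train_sel` gives the individual baselines against which the ensemble would be compared.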