A novel machine learning strategy for model selections - Stepwise Support Vector Machine (StepSVM).
ABSTRACT: An essential aspect of medical research is the prediction of a health outcome and the scientific identification of important factors. Numerous model selection methods have therefore been developed in recent years. In the era of big data, machine learning has been broadly adopted for data analysis. In particular, the Support Vector Machine (SVM) performs well in classification and prediction with high-dimensional data. This research proposes a novel model selection strategy, the Stepwise Support Vector Machine (StepSVM). The new strategy uses the SVM to conduct a modified stepwise selection, where the tuning parameter can be determined by 10-fold cross-validation that minimizes the mean squared error. Two popular methods, the conventional stepwise logistic regression model and the SVM Recursive Feature Elimination (SVM-RFE), were compared with the StepSVM. The stability and accuracy of the three strategies were evaluated by simulation studies with a complex hierarchical structure. Up to five variables were selected to predict the dichotomous cancer remission of a lung cancer patient. For the stepwise logistic regression, the mean C-statistic was 69.19%. The overall accuracy of the SVM-RFE was estimated at 70.62%. In contrast, the StepSVM provided the highest prediction accuracy, 80.57%. Although the StepSVM is more time-consuming, it is more consistent and outperforms the other two methods.
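The abstract names the ingredients of the strategy (an SVM, a stepwise search, 10-fold cross-validation scored by mean squared error, at most five variables) but not the exact procedure. The following is a minimal sketch of how those pieces could fit together, not the authors' implementation. Several choices are assumptions made only for illustration: the search is forward-only, the cross-validated mean squared error doubles as the selection criterion, the search stops when no candidate improves that error, and the SVM is a linear one without intercept trained by Pegasos-style subgradient descent so the example needs nothing beyond the standard library. All function names (`train_linear_svm`, `cv_mse`, `forward_stepwise_svm`) and the toy data are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Linear SVM (hinge loss, L2 penalty) fitted by Pegasos-style
    subgradient descent. Labels are -1/+1. No intercept is fitted, so
    features are assumed to be roughly centered (an assumption)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (L2 penalty)
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

def cv_mse(X, y, features, k=10, lam=0.01):
    """k-fold cross-validated mean squared error between predicted and
    true labels, using only the given feature indices."""
    n = len(X)
    sq_err = 0.0
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        X_train = [[X[i][f] for f in features] for i in train]
        y_train = [y[i] for i in train]
        w = train_linear_svm(X_train, y_train, lam=lam)
        for i in test:
            pred = predict(w, [X[i][f] for f in features])
            sq_err += (pred - y[i]) ** 2
    return sq_err / n

def forward_stepwise_svm(X, y, max_features=5, k=10, lam=0.01):
    """Greedy forward selection: at each step add the feature that gives
    the lowest cross-validated error; stop when nothing improves."""
    selected, best = [], float("inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scores = [(cv_mse(X, y, selected + [f], k, lam), f) for f in remaining]
        err, f = min(scores)
        if err >= best:  # no candidate improves the current model
            break
        selected.append(f)
        remaining.remove(f)
        best = err
    return selected, best

# Toy data (made up): feature 0 carries the signal, features 1-2 are noise.
rng = random.Random(42)
y = [rng.choice([-1, 1]) for _ in range(60)]
X = [[2.0 * yi + rng.uniform(-0.5, 0.5), rng.uniform(-1, 1), rng.uniform(-1, 1)]
     for yi in y]
selected, err = forward_stepwise_svm(X, y, max_features=5)
print("selected feature indices:", selected, "CV MSE:", err)
```

In this sketch the penalty `lam` is held fixed. Following the abstract, it could instead be chosen by calling `cv_mse` over a grid of candidate values and keeping the one with the smallest error.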
SUBMITTER: Guo CY
PROVIDER: S-EPMC7451646 | biostudies-literature | 2020
REPOSITORIES: biostudies-literature