ABSTRACT: Background
Accurate Sasang type diagnosis is important in Sasang medicine, a unique form of Korean medicine, yet there have been concerns about consistency among diagnoses. We investigated a data-driven integrative diagnostic model by applying machine learning to a multicenter clinical dataset with comprehensive features.
Methods
Extremely randomized trees (ERT), support vector machines, multinomial logistic regression, and K-nearest neighbor classifiers were applied, and their performance was evaluated by cross-validation. The feature importance of the classifier was analyzed to identify which information is crucial for diagnosis.
Results
The ERT classifier showed the highest performance, with an overall F1 score of 0.60 ± 0.060. The feature classes of body measurement, personality, general information, and cold-heat were more decisive than others in classifying Sasang types. Costal angle was the most informative feature. Pairwise classification revealed Sasang type-dependent distinctions: body measurement features played a key role in the TE-SE and TE-SY datasets, while personality and cold-heat features were important in the SE-SY dataset.
Conclusion
This study investigated a comprehensive diagnostic model for Sasang type using machine learning and achieved better performance than previous studies. By revealing the key features that contribute to Sasang type diagnosis, it supports data-driven decision making in clinics.
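The evaluation described in the abstract (a classifier scored by cross-validation and reported as an F1 mean ± standard deviation) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline. It implements only K-nearest neighbor, one of the four classifiers named, and assumes 5 folds, k = 3, and macro-averaged F1, none of which the abstract specifies. The data are synthetic two-feature points generated for illustration; only the class labels TE, SE, and SY are taken from the abstract.

```python
import math
import random
import statistics
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    pairs = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))
    return Counter(label for _, label in pairs[:k]).most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_validate(X, y, n_folds=5, k=3, seed=0):
    """Shuffle once, split into n_folds, and return one macro F1 per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        test = set(held_out)
        train_X = [X[i] for i in idx if i not in test]
        train_y = [y[i] for i in idx if i not in test]
        preds = [knn_predict(train_X, train_y, X[i], k) for i in held_out]
        scores.append(macro_f1([y[i] for i in held_out], preds))
    return scores


# Synthetic toy data (NOT from the study): 30 points per class around
# arbitrary centers in a made-up two-dimensional feature space.
rng = random.Random(42)
centers = {"TE": (2.0, 0.0), "SE": (-2.0, 0.0), "SY": (0.0, 2.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(30):
        X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        y.append(label)

scores = cross_validate(X, y)
mean, sd = statistics.mean(scores), statistics.stdev(scores)
print(f"macro F1 = {mean:.2f} ± {sd:.3f}")
```

The printed value reflects only the toy data and has no bearing on the reported 0.60 ± 0.060. The same loop could wrap any classifier with a comparable predict step.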
SUBMITTER: Park SY
PROVIDER: S-EPMC7903349 | biostudies-literature | 2021 Sep
REPOSITORIES: biostudies-literature
Park Sa-Yoon, Park Musun, Lee Won-Yung, Lee Choong-Yeol, Kim Ji-Hwan, Lee Siwoo, Kim Chang-Eop
Integrative Medicine Research, 2020-09-30, issue 3