Dataset Information

A multi-class classification model for supporting the diagnosis of type II diabetes mellitus.

ABSTRACT:

Background

Numerous studies have used machine-learning techniques to predict the early onset of type 2 diabetes mellitus. However, fewer studies have sought to predict an appropriate diagnosis code for the condition, and ensemble techniques such as bagging and boosting have been applied even less often. The present study aims to identify appropriate diagnosis codes for type 2 diabetes mellitus patients by building a multi-class prediction model that is parsimonious, relying on a minimal set of features. In addition, the importance of each feature for predicting the diagnosis code is provided.

Methods

This study included 149 patients with type 2 diabetes mellitus. The sample was collected from a large hospital in Taiwan between November 2017 and May 2018. Machine-learning algorithms, including instance-based, decision tree, deep neural network, and ensemble algorithms, were used to build the predictive models. Average accuracy, area under the receiver operating characteristic curve, Matthews correlation coefficient, macro-precision, recall, weighted average of precision and recall, and model process time were used to assess model performance. Information gain and gain ratio were used to quantify feature importance.
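As an illustration of two of the performance indices named above, the following is a minimal sketch of macro-averaged precision and recall and of the multi-class Matthews correlation coefficient. It is not the study's code: the function names and the class labels `code_a`, `code_b`, `code_c` are hypothetical, and the multi-class formula used here is the common generalisation of the Matthews correlation coefficient, which is an assumption about how the index was computed.

```python
from collections import Counter
from math import sqrt


def macro_precision_recall(y_true, y_pred):
    """Return (macro-precision, macro-recall): unweighted means over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls = [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # A class that is never predicted (or never occurs) scores 0 here.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient from label sequences."""
    n = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in labels)
    cov_tt = n * n - sum(true_counts[k] ** 2 for k in labels)
    cov_pp = n * n - sum(pred_counts[k] ** 2 for k in labels)
    denom = sqrt(cov_tt * cov_pp)
    # Undefined when all true or all predicted labels are one class; use 0.
    return cov_tp / denom if denom else 0.0


# Hypothetical labels standing in for diagnosis codes.
y_true = ["code_a", "code_a", "code_b", "code_b", "code_c", "code_c"]
y_pred = ["code_a", "code_a", "code_b", "code_c", "code_c", "code_c"]
macro_p, macro_r = macro_precision_recall(y_true, y_pred)
mcc = multiclass_mcc(y_true, y_pred)
```

Macro-averaging gives each class equal weight regardless of how many patients it contains, which matters when classes are unevenly represented in a small sample.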

Results

The results showed that most algorithms, with the exception of the deep neural network, performed well on all performance indices for both the training and testing datasets. Ten features, together with their importance in determining the diagnosis code of type 2 diabetes mellitus, were identified. The proposed predictive model can be further developed into a clinical diagnosis support system or integrated into existing healthcare information systems. Either application can support physicians in diagnosing type 2 diabetes mellitus patients and foster better patient-care planning.
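The feature-importance measures mentioned in this abstract, information gain and gain ratio, can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' implementation: it assumes discrete feature values (continuous features would need to be discretised first), and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, target):
    """Reduction in target entropy after splitting on a discrete feature."""
    n = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - conditional


def gain_ratio(feature, target):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    # A constant feature carries no split information; report 0.
    return information_gain(feature, target) / split_info if split_info else 0.0


# Toy data: the first feature separates the classes, the second does not.
target = ["code_a", "code_a", "code_b", "code_b"]
informative = ["x", "x", "y", "y"]
uninformative = ["x", "y", "x", "y"]
ig_informative = information_gain(informative, target)
ig_uninformative = information_gain(uninformative, target)
```

Gain ratio divides by the feature's own entropy, which penalises features with many distinct values that would otherwise score a high information gain simply by fragmenting the sample.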

SUBMITTER: Kuo KM

PROVIDER: S-EPMC7487151 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

A multi-class classification model for supporting the diagnosis of type II diabetes mellitus.

Kuo Kuang-Ming, Talley Paul, Kao YuHsi, Huang Chi Hsien

PeerJ, 2020-09-10


PMID: 32974105

Similar Datasets

Project description: Due to the increasing prevalence of type 1 diabetes mellitus (T1DM) and its complications, there is an urgent need to identify novel methods for predicting the occurrence and understanding the pathogenetic mechanisms of the disease. Accumulated data have demonstrated the potential of long noncoding RNAs (lncRNAs) as biomarkers for establishing diagnosis and predicting prognosis in numerous diseases. Yet little is known about the expression patterns and regulatory roles of lncRNAs in the pathogenesis of T1DM, or about whether they can be used as diagnostic biomarkers for the disease. To explore these questions, in the present study we conducted a comparative analysis of lncRNA expression patterns between 20 T1DM patients and 42 healthy controls by retrospectively analyzing a published microarray data set. Our results indicate that, compared with healthy controls, diabetic patients had altered levels of lncRNAs. We then used a three-time cross-validation strategy and a support vector machine to propose a specific 26-lncRNA signature (termed 26LncSigT1DM). This signature can effectively distinguish between healthy and diabetic individuals in a validation cohort (area under the curve = 0.825). After the 26LncSigT1DM was prospectively validated, we used Pearson correlation to identify 915 mRNAs whose expression levels were positively correlated with those of the 26 lncRNAs. According to their Gene Ontology annotations, these mRNAs participate in processes including cellular response to stimulus, cell communication, multicellular organismal process, and cell motility. Kyoto Encyclopedia of Genes and Genomes analysis demonstrated that the genes encoding the 915 mRNAs may be associated with the NOD-like receptor signaling pathway, the transforming growth factor β signaling pathway, and mineral absorption, suggesting that the deregulation of these lncRNAs may mediate inflammatory abnormalities and immune dysfunctions, which jointly promote the pathogenesis of T1DM. Thus, our study identifies a novel diagnostic tool and may shed more light on the molecular mechanisms underlying the pathogenesis of T1DM.

