Dataset Information

Electronic Health Record Driven Prediction for Gestational Diabetes Mellitus in Early Pregnancy.


ABSTRACT: Gestational diabetes mellitus (GDM) is conventionally confirmed with an oral glucose tolerance test (OGTT) at 24 to 28 weeks of gestation, but it is still uncertain whether it can be predicted in early pregnancy through secondary use of electronic health records (EHRs). To this end, a cost-sensitive hybrid model (CSHM) and five conventional machine learning methods are used to construct predictive models that capture the future risk of GDM from temporally aggregated EHRs. The experimental data are sourced from a nested case-control study cohort of 33,935 pregnant women at West China Second Hospital. After data cleaning, 4,378 cases and 50 attributes are retained for the dataset. After selecting the most feasible method, the cost parameter of the CSHM is tuned to handle the class imbalance of the dataset. In the experiment, 3,940 samples are used for training and the remaining 438 samples for testing. Although the accuracy on positive samples is barely acceptable (62.16%), the results suggest that the vast majority (98.4%) of instances predicted as positive are true positives. To our knowledge, this is the first study to apply machine learning models to EHRs to predict GDM, which will facilitate personalized medicine in maternal health management in the future.
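The two percentages in the abstract measure different things, and a short sketch may help keep them apart. It assumes that "accuracy of positive samples" corresponds to recall (sensitivity) and that the share of predicted positives that are real positives corresponds to precision; this reading is an interpretation, not something the abstract states. The function names and the confusion-matrix counts below are hypothetical and are not taken from the study; only the split sizes come from the abstract.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual positives that the model flags as positive."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp)


# Split sizes as stated in the abstract: 3,940 training + 438 test samples.
N_TOTAL, N_TRAIN, N_TEST = 4378, 3940, 438
assert N_TRAIN + N_TEST == N_TOTAL

# Hypothetical counts for illustration only; NOT results from the study.
tp, fn, fp = 50, 30, 1

print(f"recall    = {recall(tp, fn):.3f}")
print(f"precision = {precision(tp, fp):.3f}")
```

With counts of this shape, a model can miss a sizeable share of true cases (moderate recall) while almost never raising a false alarm (high precision), which is the pattern the abstract describes.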

SUBMITTER: Qiu H 

PROVIDER: S-EPMC5703904 | biostudies-other | 2017 Nov

REPOSITORIES: biostudies-other

Publications

Electronic Health Record Driven Prediction for Gestational Diabetes Mellitus in Early Pregnancy.

Qiu Hang, Yu Hai-Yan, Wang Li-Ya, Yao Qiang, Wu Si-Nan, Yin Can, Fu Bo, Zhu Xiao-Juan, Zhang Yan-Long, Xing Yong, Deng Jun, Yang Hao, Lei Shun-Dong

Scientific Reports, 2017-11-27, issue 1


Similar Datasets

| S-EPMC6990028 | biostudies-literature
| S-EPMC9022440 | biostudies-literature
| S-EPMC7269799 | biostudies-literature
| S-EPMC5860043 | biostudies-literature
| S-EPMC5378599 | biostudies-literature
| S-EPMC6463614 | biostudies-literature
| S-EPMC8003050 | biostudies-literature
| S-EPMC5664285 | biostudies-literature
| S-EPMC6080723 | biostudies-literature
| S-EPMC6204222 | biostudies-literature