
Dataset Information


A Machine Learning Classifier for Assigning Individual Patients With Systemic Sclerosis to Intrinsic Molecular Subsets.


ABSTRACT:

OBJECTIVE: High-throughput gene expression profiling of tissue samples from patients with systemic sclerosis (SSc) has identified 4 "intrinsic" gene expression subsets: inflammatory, fibroproliferative, normal-like, and limited. Prior methods required agglomerative clustering of many samples. In order to classify individual patients in clinical trials or for diagnostic purposes, supervised methods that can assign single samples to molecular subsets are required. We undertook this study to introduce a novel machine learning classifier as a robust accurate intrinsic subset predictor.

METHODS: Three independent gene expression cohorts were curated and merged to create a data set covering 297 skin biopsy samples from 102 unique patients and controls, which was used to train a machine learning algorithm. We performed external validation using 3 independent SSc cohorts, including a gene expression data set generated by an independent laboratory on a different microarray platform. In total, 413 skin biopsy samples from 213 individuals were analyzed in the training and testing cohorts.

RESULTS: Repeated cross-fold validation identified consistent and discriminative markers using multinomial elastic net, performing with an average classification accuracy of 87.1% with high sensitivity and specificity. In external validation, the classifier achieved an average accuracy of 85.4%. Reanalyzing data from a previous study, we identified subsets of patients that represent the canonical inflammatory, fibroproliferative, and normal-like subsets.

CONCLUSION: We developed a highly accurate classifier for SSc molecular subsets for individual patient samples. The method can be used in SSc clinical trials to identify an intrinsic subset on individual samples. Our method provides a robust data-driven approach to aid clinical decision-making and interpretation of heterogeneous molecular information in SSc patients.
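The abstract reports classification accuracy together with sensitivity and specificity for a four-class problem. As a minimal sketch of how such figures are commonly computed, the block below scores predicted subset labels against reference labels, treating each subset one-vs-rest. This is not the authors' code: the function name `per_subset_metrics` and the toy label lists are hypothetical and chosen only for illustration, and the abstract does not state exactly how its averages were formed.

```python
from collections import Counter

# The four intrinsic subsets named in the abstract.
SUBSETS = ["inflammatory", "fibroproliferative", "normal-like", "limited"]


def per_subset_metrics(true_labels, predicted_labels, subsets=SUBSETS):
    """Overall accuracy plus one-vs-rest sensitivity/specificity per subset.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    # Count every (true, predicted) pair: a sparse confusion matrix.
    pairs = Counter(zip(true_labels, predicted_labels))
    total = len(true_labels)

    correct = sum(pairs[(s, s)] for s in subsets)
    metrics = {"accuracy": correct / total}

    for s in subsets:
        tp = pairs[(s, s)]
        fn = sum(pairs[(s, other)] for other in subsets if other != s)
        fp = sum(pairs[(other, s)] for other in subsets if other != s)
        tn = total - tp - fn - fp
        metrics[s] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return metrics


# Toy labels, invented for this example only (not study data).
truth = ["inflammatory", "inflammatory",
         "fibroproliferative", "fibroproliferative",
         "normal-like", "normal-like",
         "limited", "limited"]
calls = ["inflammatory", "fibroproliferative",
         "fibroproliferative", "fibroproliferative",
         "normal-like", "normal-like",
         "limited", "normal-like"]

result = per_subset_metrics(truth, calls)
print(result["accuracy"])
for subset in SUBSETS:
    print(subset, result[subset])
```

In a repeated cross-validation setting, one plausible reading of an "average" accuracy is the mean of this value over all held-out folds and repeats, but that is an assumption about the reporting rather than something the abstract confirms.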

SUBMITTER: Franks JM 

PROVIDER: S-EPMC6764877 | biostudies-literature | 2019 Oct

REPOSITORIES: biostudies-literature


Publications

A Machine Learning Classifier for Assigning Individual Patients With Systemic Sclerosis to Intrinsic Molecular Subsets.

Franks JM, Martyanov V, Cai G, Wang Y, Li Z, Wood TA, Whitfield ML

Arthritis & Rheumatology (Hoboken, N.J.), 2019-09-02, issue 10



Similar Datasets

2019-01-19 | GSE125362 | GEO
| PRJNA515920 | ENA
| S-EPMC3326181 | biostudies-literature
| S-EPMC8563671 | biostudies-literature
| S-EPMC7233069 | biostudies-literature
2023-02-01 | GSE217067 | GEO
| S-EPMC7244534 | biostudies-literature
| S-EPMC6946916 | biostudies-literature
| S-EPMC8097070 | biostudies-literature
| S-EPMC5841386 | biostudies-literature