Dataset Information

Testing the additional predictive value of high-dimensional molecular data.

ABSTRACT:

Background

While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature.

Results

We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purposes, it is applied to two publicly available cancer data sets.
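The procedure described above can be sketched roughly as follows. This is a minimal illustration in Python, not the "globalboosttest" implementation: the function names (`clinical_offset`, `boosted_risk`, `permutation_test`), the step length `nu`, the componentwise least-squares base learner, and the +1 correction in the p-value are all assumptions made for this sketch. The idea is to fit a logistic regression on the clinical covariates, use its linear predictor as a fixed offset, boost on the molecular predictors for `mstop` iterations, and compare the resulting binomial negative log-likelihood with the values obtained after permuting the rows of the molecular data.

```python
import numpy as np

def clinical_offset(Z, y, n_iter=25):
    """Logistic regression on clinical covariates Z (Newton-Raphson).
    Returns the fitted linear predictor, used later as a fixed offset."""
    Zc = np.column_stack([np.ones(len(y)), Z])  # add an intercept column
    beta = np.zeros(Zc.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Zc @ beta)))
        w = p * (1.0 - p)
        hessian = Zc.T @ (Zc * w[:, None]) + 1e-8 * np.eye(Zc.shape[1])
        beta = beta + np.linalg.solve(hessian, Zc.T @ (y - p))
    return Zc @ beta

def boosted_risk(X, y, offset, mstop=100, nu=0.1):
    """Componentwise least-squares boosting of the binomial log-likelihood,
    started from the clinical offset. Returns the mean negative
    log-likelihood after mstop iterations (lower means better fit)."""
    Xc = X - X.mean(axis=0)                 # center each molecular predictor
    ss = (Xc ** 2).sum(axis=0) + 1e-12      # column sums of squares
    f = offset.astype(float).copy()
    for _ in range(mstop):
        grad = y - 1.0 / (1.0 + np.exp(-f))  # negative gradient of the loss
        score = Xc.T @ grad
        j = np.argmax(score ** 2 / ss)       # predictor that best fits the gradient
        f += nu * (score[j] / ss[j]) * Xc[:, j]
    return float(np.mean(np.logaddexp(0.0, f) - y * f))

def permutation_test(X, Z, y, mstop=100, n_perm=200, seed=0):
    """Permutation p-value for the added predictive value of X given Z.
    Rows of X are permuted, so the link between y and Z stays intact."""
    rng = np.random.default_rng(seed)
    offset = clinical_offset(Z, y)
    observed = boosted_risk(X, y, offset, mstop)
    permuted = np.array([
        boosted_risk(X[rng.permutation(len(y))], y, offset, mstop)
        for _ in range(n_perm)
    ])
    # +1 correction (an assumption of this sketch) keeps the p-value above zero
    p_value = (1 + np.sum(permuted <= observed)) / (1 + n_perm)
    return observed, p_value
```

Here `y` is assumed to be a 0/1 array, `Z` an n-by-q matrix of clinical covariates, and `X` an n-by-p matrix of molecular predictors. A small p-value would suggest that the molecular data improve the fit beyond what the clinical covariates already explain.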

Conclusions

Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available. It is implemented in the R package "globalboosttest", which is publicly available from R-forge and will be submitted to CRAN as soon as possible.

SUBMITTER: Boulesteix AL

PROVIDER: S-EPMC2837029 | biostudies-literature | 2010 Feb

REPOSITORIES: biostudies-literature


Publications

Testing the additional predictive value of high-dimensional molecular data.

Anne-Laure Boulesteix, Torsten Hothorn

BMC Bioinformatics, 2010-02-08


PMID: 20144191

Similar Datasets

Project description:

Importance

Myelin oligodendrocyte glycoprotein-IgG1-associated disorder (MOGAD) is a distinct central nervous system-demyelinating disease. Positive results on MOG-IgG1 testing by live cell-based assays can confirm a MOGAD diagnosis, but false-positive results may occur.

Objective

To determine the positive predictive value (PPV) of MOG-IgG1 testing in a tertiary referral center.

Design, setting, and participants

This diagnostic study was conducted over 2 years, from January 1, 2018, through December 31, 2019. Patients in the Mayo Clinic who were consecutively tested for MOG-IgG1 by live cell-based flow cytometry during their diagnostic workup were included. Patients without research authorization were excluded.

Main outcomes and measures

Medical records of patients who were tested were initially reviewed by 2 investigators blinded to MOG-IgG1 serostatus, and pretest probability was classified as high or low (suggestive of MOGAD or not). Testing of MOG-IgG1 used a live-cell fluorescence-activated cell-sorting assay; an IgG binding index value of 2.5 or more with an end titer of 1:20 or more was considered positive. Cases positive for MOG-IgG1 were independently designated by 2 neurologists as true-positive or false-positive results at last follow-up, based on current international recommendations on diagnosis or identification of alternative diagnoses; consensus was reached for cases in which disagreement existed.

Results

A total of 1617 patients were tested, and 357 were excluded. Among 1260 included patients tested over 2 years, the median (range) age at testing was 46 (0-98) years, and 792 patients were female (62.9%). A total of 92 of 1260 (7.3%) were positive for MOG-IgG1. Twenty-six results (28%) were designated as false positive by the 2 raters, with an overall agreement on 91 of 92 cases (99%) for true and false positivity. Alternative diagnoses included multiple sclerosis (n = 11), infarction (n = 3), B12 deficiency (n = 2), neoplasia (n = 2), genetically confirmed adrenomyeloneuropathy (n = 1), and other conditions (n = 7). The overall PPV (number of true-positive results/total positive results) was 72% (95% CI, 62%-80%) and titer dependent (PPVs: 1:1000, 100%; 1:100, 82%; 1:20-40, 51%). The median titer was higher with true-positive results (1:100 [range, 1:20-1:10000]) than false-positive results (1:40 [range, 1:20-1:100]; P < .001). The PPV was higher for children (94% [95% CI, 72%-99%]) vs adults (67% [95% CI, 56%-77%]) and patients with high pretest probability (85% [95% CI, 76%-92%]) vs low pretest probability (12% [95% CI, 3%-34%]). The specificity of MOG-IgG1 testing was 97.8%.

Conclusions and relevance

This study confirms MOG-IgG1 as a highly specific biomarker for MOGAD, but when using a cutoff of 1:20, it has a low PPV of 72%. Caution is advised in the interpretation of low titers among patients with atypical phenotypes, because ordering MOG-IgG1 in low pretest probability situations will increase the proportion of false-positive results.
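As a quick check of the arithmetic behind the overall PPV reported above, the following sketch recomputes it from the stated counts (92 positive results, 26 of them false positive). The description does not say which confidence interval method was used; the Wilson score interval here is an assumption of this sketch, and `wilson_interval` is a hypothetical helper name.

```python
import math

def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

positives = 92          # MOG-IgG1 positive results
false_positives = 26    # designated false positive by the raters
true_positives = positives - false_positives

# PPV = true-positive results / total positive results
ppv = true_positives / positives
low, high = wilson_interval(true_positives, positives)
print(f"PPV = {ppv:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

With these counts the PPV comes out at roughly 72%, in line with the reported figure.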

