Dataset Information

Accounting for undetected compounds in statistical analyses of mass spectrometry 'omic studies.


ABSTRACT: Mass spectrometry is an important high-throughput technique for profiling small molecular compounds in biological samples and is widely used to identify potential diagnostic and prognostic compounds associated with disease. Data generated by mass spectrometry commonly have many missing values, which arise when a compound is absent from a sample or is present at a concentration below the detection limit. Several strategies are available for statistically analyzing data with missing values. The accelerated failure time (AFT) model assumes that all missing values result from censoring below a detection limit. Under a mixture model, missing values can result from a combination of censoring and the absence of a compound (a point mass). We compare the power and estimation of a mixture model with those of an AFT model. Based on simulated data, we found the AFT model to have greater power to detect differences in means and point mass proportions between groups. However, the AFT model yielded biased estimates, with the bias increasing as the proportion of observations in the point mass increased, whereas estimates from the mixture model were unbiased except when all missing observations came from censoring. These findings suggest using the AFT model for hypothesis testing and the mixture model for estimation. We demonstrated this approach through application to glycomics data of serum samples from women with ovarian cancer and matched controls.
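The difference between the two models described in the abstract comes down to how a missing value contributes to the likelihood. The following is a minimal sketch, not the authors' implementation: it assumes log-transformed intensities that follow a normal distribution (a log-normal AFT model), a single known detection limit, and no covariates. The function names and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def norm_logpdf(y, mu, sigma):
    """Log density of a normal distribution with mean mu and SD sigma."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(y, mu, sigma):
    """Cumulative distribution function of the same normal distribution."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def aft_loglik(observed, n_missing, limit, mu, sigma):
    """Censored-normal log-likelihood (assumed log-normal AFT form).

    Every missing value is treated as left-censored at `limit`.
    """
    ll = sum(norm_logpdf(y, mu, sigma) for y in observed)
    if n_missing:
        ll += n_missing * math.log(norm_cdf(limit, mu, sigma))
    return ll


def mixture_loglik(observed, n_missing, limit, mu, sigma, p_absent):
    """Point-mass mixture log-likelihood.

    A missing value is either a true absence (probability p_absent)
    or a present compound censored below `limit`.
    """
    ll = sum(math.log(1.0 - p_absent) + norm_logpdf(y, mu, sigma)
             for y in observed)
    if n_missing:
        p_censored = (1.0 - p_absent) * norm_cdf(limit, mu, sigma)
        ll += n_missing * math.log(p_absent + p_censored)
    return ll


# Hypothetical example: three detected log-intensities, four missing values.
observed = [2.1, 2.8, 3.4]
print(aft_loglik(observed, 4, 1.5, mu=2.5, sigma=1.0))
print(mixture_loglik(observed, 4, 1.5, mu=2.5, sigma=1.0, p_absent=0.3))
```

With `p_absent` set to 0 the mixture likelihood reduces to the AFT likelihood, which corresponds to the special case the abstract singles out, where all missing observations come from censoring.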

SUBMITTER: Taylor SL 

PROVIDER: S-EPMC3905689 | biostudies-literature | 2013 Dec

REPOSITORIES: biostudies-literature

Publications

Accounting for undetected compounds in statistical analyses of mass spectrometry 'omic studies.

Taylor SL, Leiserowitz GS, Kim K

Statistical Applications in Genetics and Molecular Biology, 2013 Dec 1, issue 6


Similar Datasets

| S-EPMC5920534 | biostudies-literature
| S-EPMC5101133 | biostudies-literature
| S-EPMC6693809 | biostudies-other
2005-09-20 | GSE2744 | GEO
| S-EPMC10831795 | biostudies-literature
| S-EPMC6007079 | biostudies-literature
| S-EPMC3489540 | biostudies-literature
| S-EPMC4341067 | biostudies-literature
| S-EPMC8959065 | biostudies-literature
| S-EPMC7602401 | biostudies-literature