
Dataset Information


Selecting Reliable mRNA Expression Measurements Across Platforms Improves Downstream Analysis.


ABSTRACT: With increasing use of publicly available gene expression data sets, the quality of the expression data is a critical issue for downstream analysis, gene signature development, and cross-validation of data sets. Thus, identifying reliable expression measurements by leveraging multiple mRNA expression platforms is an important analytical task. In this study, we propose a statistical framework for selecting reliable measurements between platforms by modeling the correlations of mRNA expression levels using a beta-mixture model. The model-based selection provides an effective and objective way to separate good probes from probes with low quality, thereby improving the efficiency and accuracy of the analysis. The proposed method can be used to compare two microarray technologies or microarray and RNA sequencing measurements. We tested the approach in two matched profiling data sets, using microarray gene expression measurements from the same samples profiled on both Affymetrix and Illumina platforms. We also applied the algorithm to mRNA expression data to compare Affymetrix microarray data with RNA sequencing measurements. The algorithm successfully identified probes/genes with reliable measurements. Removing the unreliable measurements resulted in significant improvements for gene signature development and functional annotations.
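The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it assumes details the abstract does not state: Pearson correlation per gene across matched samples, a mapping of correlations to (0, 1) via (r + 1) / 2, exactly two beta components, and an EM loop whose M-step uses weighted method-of-moments estimates in place of full maximum likelihood. The function names `per_gene_correlation` and `reliable_posterior` are hypothetical.

```python
import numpy as np
from scipy.stats import beta


def per_gene_correlation(x, y):
    """Pearson correlation of each row (gene) of x with the same row of y.

    x and y are genes-by-samples arrays measured on two platforms,
    with rows and columns already matched (an assumption of this sketch).
    """
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    denom = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return (xc * yc).sum(axis=1) / denom


def reliable_posterior(r, n_iter=100, eps=1e-6):
    """Posterior probability that each correlation belongs to the
    higher-mean component of a two-component beta mixture.

    Assumptions of this sketch: correlations are rescaled to (0, 1),
    and the M-step uses weighted moments rather than exact MLE.
    No safeguard is included for a component collapsing to zero weight.
    """
    u = np.clip((np.asarray(r, dtype=float) + 1.0) / 2.0, eps, 1.0 - eps)
    # Initialise responsibilities by splitting at the median.
    resp = (u > np.median(u)).astype(float)
    for _ in range(n_iter):
        comps = []
        for w in (1.0 - resp, resp):
            # M-step: weighted mean and variance give the beta shape parameters.
            m = np.average(u, weights=w)
            v = max(np.average((u - m) ** 2, weights=w), 1e-10)
            k = m * (1.0 - m) / v - 1.0
            comps.append((w.mean(), m * k, (1.0 - m) * k))
        # E-step: update responsibilities from the weighted densities.
        d0, d1 = (p * beta.pdf(u, a, b) for p, a, b in comps)
        resp = d1 / (d0 + d1 + 1e-300)
    # Report the posterior for whichever component has the higher mean.
    means = [a / (a + b) for _, a, b in comps]
    return resp if means[1] >= means[0] else 1.0 - resp


# Simulated usage: 150 genes that agree across platforms, 150 that do not.
rng = np.random.default_rng(0)
platform_a = rng.normal(size=(300, 40))
platform_b = rng.normal(size=(300, 40))
platform_b[:150] = platform_a[:150] + 0.3 * rng.normal(size=(150, 40))

r = per_gene_correlation(platform_a, platform_b)
keep = reliable_posterior(r) > 0.5  # boolean mask of genes called reliable
```

The 0.5 posterior cutoff is an arbitrary choice for illustration; a stricter threshold would trade retained genes for higher confidence.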

SUBMITTER: Tong P 

PROVIDER: S-EPMC4863871 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature


Publications

Selecting Reliable mRNA Expression Measurements Across Platforms Improves Downstream Analysis.

Pan Tong, Lixia Diao, Li Shen, Lerong Li, John Victor Heymach, Luc Girard, John D Minna, Kevin R Coombes, Lauren Averett Byers, Jing Wang

Cancer Informatics, 2016-05-10



Similar Datasets

| S-EPMC206463 | biostudies-literature
| S-EPMC3661442 | biostudies-literature
2019-05-30 | GSE129556 | GEO
| S-EPMC2812950 | biostudies-literature
| S-EPMC8068083 | biostudies-literature
| S-EPMC3059153 | biostudies-literature