Dataset Information

Bayesian approach to transforming public gene expression repositories into disease diagnosis databases.


ABSTRACT: The rapid accumulation of gene expression data has offered unprecedented opportunities to study human diseases. The National Center for Biotechnology Information Gene Expression Omnibus is currently the largest database that systematically documents the genome-wide molecular basis of diseases. However, thus far, this resource has been far from fully utilized. This paper describes the first study to transform public gene expression repositories into an automated disease diagnosis database. Particularly, we have developed a systematic framework, including a two-stage Bayesian learning approach, to achieve the diagnosis of one or multiple diseases for a query expression profile along a hierarchical disease taxonomy. Our approach, including standardizing cross-platform gene expression data and heterogeneous disease annotations, allows analyzing both sources of information in a unified probabilistic system. A high level of overall diagnostic accuracy was shown by cross validation. It was also demonstrated that the power of our method can increase significantly with the continued growth of public gene expression repositories. Finally, we showed how our disease diagnosis system can be used to characterize complex phenotypes and to construct a disease-drug connectivity map.

SUBMITTER: Huang H 

PROVIDER: S-EPMC2872390 | biostudies-literature | 2010 Apr

REPOSITORIES: biostudies-literature

Publications

Bayesian approach to transforming public gene expression repositories into disease diagnosis databases.

Huang H, Liu CC, Zhou XJ

Proceedings of the National Academy of Sciences of the United States of America, 2010-04-01, issue 15



Similar Datasets

| S-EPMC6980531 | biostudies-literature
| S-EPMC5587190 | biostudies-literature
2009-11-30 | GSE18619 | GEO
2010-05-19 | E-GEOD-18619 | biostudies-arrayexpress
| S-EPMC6317063 | biostudies-literature
| S-EPMC10798454 | biostudies-literature
| S-EPMC7418995 | biostudies-literature
| S-EPMC7307974 | biostudies-literature
| S-EPMC5613734 | biostudies-literature
| S-EPMC7116434 | biostudies-literature