
Dataset Information


Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.


ABSTRACT: Modern high-throughput biotechnologies such as microarrays and next-generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only a limited amount of data is observed for each individual feature, giving rise to the classical 'large p, small n' problem. The Bayesian hierarchical model, which can borrow strength across features within the same dataset, has been recognized as an effective tool for analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that, in the Big Data era, large amounts of historical data are available and should be taken advantage of. It provides a new framework for enhancing the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategy. The strategy also enables borrowing information across different platforms, which could be extremely useful given the emergence of new technologies and the accumulation of data from different platforms. Our method has been implemented in the R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.
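
The shrinkage effect mentioned in the abstract can be illustrated with a minimal sketch. This is a generic normal-normal empirical-Bayes example, not the adaptiveHM method; the function name `shrink_means`, the assumption of a known common sampling variance, and the data values are all hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def shrink_means(observed, sampling_var):
    """Empirical-Bayes posterior means under a normal-normal model.

    observed: per-feature sample means (one value per feature)
    sampling_var: assumed known variance of each observed mean (illustrative)
    """
    prior_mean = mean(observed)
    # Method-of-moments estimate of the between-feature variance, floored at 0.
    prior_var = max(variance(observed) - sampling_var, 0.0)
    # Weight placed on the prior mean: 1 is full shrinkage, 0 is none.
    weight = sampling_var / (sampling_var + prior_var)
    return [weight * prior_mean + (1 - weight) * x for x in observed]


# Hypothetical data: nine unremarkable features and one genuinely extreme one.
observed = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, 5.0]
shrunk = shrink_means(observed, sampling_var=1.0)
print(f"extreme feature: observed {observed[-1]:.2f}, shrunk {shrunk[-1]:.2f}")
```

Every estimate is pulled toward the mean across features by the same weight. If the last feature truly differs from the rest, that pull is the kind of over-correction the abstract describes.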

SUBMITTER: Li B 

PROVIDER: S-EPMC5599104 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature


Publications

Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

Li B, Li Y, Qin ZS

Statistics in Biosciences, 2016-07-08, issue 1



Similar Datasets

| S-EPMC10037215 | biostudies-literature
| S-EPMC10442720 | biostudies-literature
| S-EPMC4468192 | biostudies-literature
| S-EPMC3905248 | biostudies-literature
| S-EPMC7168977 | biostudies-literature
| S-EPMC6002135 | biostudies-literature
| S-EPMC5913736 | biostudies-other
| S-EPMC10660288 | biostudies-literature
| S-EPMC4643634 | biostudies-literature
| S-EPMC4720014 | biostudies-literature