Dataset Information

A hierarchical Bayesian network approach for linkage disequilibrium modeling and data-dimensionality reduction prior to genome-wide association studies.


ABSTRACT:

Background

Discovering the genetic basis of common genetic diseases in the human genome represents a public health issue. However, the dimensionality of the genetic data (up to 1 million genetic markers) and its complexity make the statistical analysis a challenging task.

Results

We present an accurate model of the dependences between genetic markers, based on a forest of hierarchical latent class models, a particular class of probabilistic graphical models. This model offers a framework suited to the fuzzy nature of linkage disequilibrium blocks. In addition, the data dimensionality can be reduced through the latent variables of the model, which synthesize the information borne by genetic markers. To tackle the learning of both the forest structure and the probability distributions, a generic algorithm has been proposed. A first implementation of our algorithm has been shown to be tractable on benchmarks describing 10^5 variables for 2000 individuals.
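To make the notion of dependence between markers concrete, the following is a minimal sketch of one common pairwise linkage disequilibrium proxy: the squared Pearson correlation (r^2) between two markers whose genotypes are coded 0, 1 or 2. It is an illustration only and is not the forest-learning algorithm described above; the function name `ld_r2` and the 0/1/2 genotype coding are assumptions introduced for this example.

```python
def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between two genotype vectors.

    Assumption: each vector holds one genotype per individual,
    coded as the count of minor alleles (0, 1 or 2).
    """
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("genotype vectors must be non-empty and of equal length")

    n = len(snp_a)
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n

    # Unnormalised covariance and variances; the 1/n factors cancel in r^2.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)

    # A monomorphic marker carries no information about its neighbours.
    if var_a == 0 or var_b == 0:
        return 0.0
    return (cov * cov) / (var_a * var_b)


if __name__ == "__main__":
    marker_1 = [0, 1, 2, 1]
    marker_2 = [0, 1, 2, 1]   # identical genotypes: complete dependence
    marker_3 = [0, 0, 2, 2]
    marker_4 = [0, 2, 0, 2]   # uncorrelated with marker_3
    print(ld_r2(marker_1, marker_2))
    print(ld_r2(marker_3, marker_4))
```

Markers with high pairwise r^2 are the kind of strongly dependent variables that a latent variable could plausibly summarise, which is the intuition behind reducing dimensionality through latent variables.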

Conclusions

The forest of hierarchical latent class models offers several advantages for genome-wide association studies: accurate modeling of linkage disequilibrium, flexible data dimensionality reduction and biological meaning borne by latent variables.

SUBMITTER: Mourad R 

PROVIDER: S-EPMC3033325 | biostudies-literature | 2011 Jan

REPOSITORIES: biostudies-literature

Publications

A hierarchical Bayesian network approach for linkage disequilibrium modeling and data-dimensionality reduction prior to genome-wide association studies.

Raphaël Mourad, Christine Sinoquet, Philippe Leray

BMC Bioinformatics, 2011-01-12


Similar Datasets

| S-EPMC2918651 | biostudies-literature
| S-EPMC379228 | biostudies-literature
| S-EPMC6637384 | biostudies-literature
| S-EPMC11245513 | biostudies-literature
| S-EPMC3931163 | biostudies-literature
| S-EPMC6822679 | biostudies-literature
| S-EPMC3897757 | biostudies-literature
| S-EPMC4172344 | biostudies-literature
| S-EPMC5322361 | biostudies-literature
| S-EPMC2765270 | biostudies-literature