Dataset Information

SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association.


ABSTRACT: Association studies have been widely used to identify genetic liability variants for complex diseases. Scanning a chromosomal region one single nucleotide polymorphism (SNP) at a time may not fully exploit linkage disequilibrium, whereas haplotype analyses tend to require a fairly large number of parameters and can therefore lose power. Clustering algorithms, such as the cladistic approach, have been proposed to reduce the dimensionality, yet they have important limitations. We propose a SNP-Haplotype Adaptive REgression (SHARE) algorithm that seeks the most informative set of SNPs for genetic association in a targeted candidate region. It grows and shrinks haplotypes by one SNP at a time in a stepwise fashion and compares the prediction errors of the resulting models via cross-validation. Depending on the evolutionary history of the disease mutations and the markers, this set may contain a single SNP or several SNPs that lay a foundation for haplotype analyses. Haplotype phase ambiguity is effectively accounted for by treating haplotype reconstruction as a part of the learning procedure. Simulations and a data application show that our method has improved power over existing methodologies and that the results are informative in the search for disease-causal loci.
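
The stepwise search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes unphased genotypes coded 0/1/2, uses the joint genotype pattern over the selected SNPs as a stand-in for a haplotype, replaces the regression model with a majority-vote predictor, and does not handle phase ambiguity. The names `cv_error`, `share_like_search`, and `make_demo_data` are hypothetical and exist only for this illustration.

```python
import random
from collections import defaultdict


def cv_error(genotypes, labels, snps, k=5, seed=0):
    """K-fold cross-validated misclassification rate for a SNP subset.

    Assumption: the joint genotype pattern over `snps` stands in for a
    haplotype, and the model predicts the majority training label for
    each pattern (a simplification, not the regression used by SHARE).
    """
    n = len(labels)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    errors = 0
    for fold in range(k):
        test = set(order[fold::k])
        counts = defaultdict(lambda: [0, 0])  # pattern -> [controls, cases]
        overall = [0, 0]
        for i in order:
            if i not in test:
                key = tuple(genotypes[i][s] for s in snps)
                counts[key][labels[i]] += 1
                overall[labels[i]] += 1
        default = int(overall[1] > overall[0])  # fallback for unseen patterns
        for i in test:
            key = tuple(genotypes[i][s] for s in snps)
            c = counts.get(key)
            pred = default if c is None else int(c[1] > c[0])
            errors += pred != labels[i]
    return errors / n


def share_like_search(genotypes, labels, max_steps=20):
    """Grow or shrink the SNP set by one SNP while the CV error improves."""
    n_snps = len(genotypes[0])
    current = []
    best = cv_error(genotypes, labels, current)
    for _ in range(max_steps):
        # Candidate moves: add one unused SNP, or drop one selected SNP.
        moves = [sorted(current + [s]) for s in range(n_snps) if s not in current]
        moves += [[t for t in current if t != s] for s in current]
        err, move = min((cv_error(genotypes, labels, m), m) for m in moves)
        if err >= best:  # stop when no single-SNP move lowers the error
            break
        best, current = err, move
    return current, best


def make_demo_data(n=300, n_snps=4, seed=1):
    """Synthetic data (hypothetical): disease status depends on SNPs 1 and 3."""
    rng = random.Random(seed)
    genotypes = [[rng.randrange(3) for _ in range(n_snps)] for _ in range(n)]
    labels = [int(g[1] >= 1 and g[3] >= 1) for g in genotypes]
    return genotypes, labels


if __name__ == "__main__":
    genotypes, labels = make_demo_data()
    print(share_like_search(genotypes, labels))
```

Requiring a strict improvement at every step keeps the search from cycling between a grow move and the matching shrink move. A fuller treatment would score haplotypes reconstructed within each training fold, as the abstract indicates for phase ambiguity.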

SUBMITTER: Dai JY 

PROVIDER: S-EPMC2742496 | biostudies-literature | 2009 Oct

REPOSITORIES: biostudies-literature

Publications

SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association.

Dai JY, Leblanc M, Smith NL, Psaty B, Kooperberg C

Biostatistics (Oxford, England), 2009-07-15, issue 4


Similar Datasets

2012-01-11 | GSE34945 | GEO
2012-01-11 | E-GEOD-34945 | biostudies-arrayexpress
| S-EPMC10687199 | biostudies-literature
| S-EPMC5560484 | biostudies-other
| S-EPMC3326652 | biostudies-literature
| S-EPMC3476707 | biostudies-literature
| S-EPMC6056425 | biostudies-literature
| S-EPMC6233883 | biostudies-literature
| S-EPMC7071008 | biostudies-literature
| S-EPMC5685491 | biostudies-literature