Dataset Information

Single nucleotide polymorphism (SNP)-strings: an alternative method for assessing genetic associations.


ABSTRACT: BACKGROUND: Genome-wide association studies (GWAS) identify disease-associations for single-nucleotide-polymorphisms (SNPs) from scattered genomic-locations. However, SNPs frequently reside on several different SNP-haplotypes, only some of which may be disease-associated. This circumstance lowers the observed odds-ratio for disease-association.

METHODOLOGY/PRINCIPAL FINDINGS: Here we develop a method to identify the two SNP-haplotypes that combine to produce each person's SNP-genotype over specified chromosomal segments. Two multiple sclerosis (MS)-associated genetic regions were modeled: DRB1 (a Class II molecule of the major histocompatibility complex) and MMEL1 (an endopeptidase that degrades both neuropeptides and β-amyloid). For each locus, we considered sets of eleven adjacent SNPs, surrounding the putative disease-associated gene and spanning ~200 kb of DNA. The SNP-information was converted into an ordered-set of eleven numbers (subject-vectors) based on whether a person had zero, one, or two copies of a particular SNP-variant at each sequential SNP-location. SNP-strings were defined as those ordered-combinations of eleven numbers (0 or 1), each representing a haplotype, two of which combined to form the observed subject-vector. Subject-vectors were resolved using probabilistic methods. In both regions, only a small number of SNP-strings were present. We compared our method to the SHAPEIT-2 phasing-algorithm. When the SNP-information spanning 200 kb was used, SHAPEIT-2 was inaccurate. When the SHAPEIT-2 window was increased to 2,000 kb, the concordance between the two methods, in both of these eleven-SNP regions, was over 99%, suggesting that, in these regions, both methods were quite accurate. Nevertheless, correspondence was not uniformly high over the entire DNA-span but, rather, was characterized by alternating peaks and valleys of concordance. Moreover, in the valleys of poor-correspondence, SHAPEIT-2 was also inconsistent with itself, suggesting that the SNP-string method is more accurate across the entire region.

CONCLUSIONS/SIGNIFICANCE: Accurate haplotype identification will enhance the detection of genetic-associations. The SNP-string method provides a simple means to accomplish this and can be extended to cover larger genomic regions, thereby improving a GWAS's power, even for those published previously.
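The relationship between a subject-vector and its candidate SNP-strings can be illustrated with a short sketch. This is not the authors' implementation: the function name `compatible_pairs` is hypothetical, and the code only enumerates the pairs of 0/1 strings whose element-wise sum reproduces a given subject-vector. It does not attempt the probabilistic resolution step the abstract mentions, whose details are not given here.

```python
from itertools import product


def compatible_pairs(subject_vector):
    """Enumerate unordered pairs of 0/1 strings whose element-wise sum
    equals the subject-vector (entries 0, 1, or 2 copies of a variant).

    Hypothetical helper for illustration only; not the published method.
    """
    if any(g not in (0, 1, 2) for g in subject_vector):
        raise ValueError("subject-vector entries must be 0, 1, or 2")

    # Homozygous sites (0 or 2) are fixed on both strings; heterozygous
    # sites (1) can place the variant on either string.
    het_sites = [i for i, g in enumerate(subject_vector) if g == 1]
    base = [g // 2 for g in subject_vector]  # 0 -> 0, 2 -> 1, 1 -> 0 (overwritten below)

    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        first, second = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            first[site], second[site] = bit, 1 - bit
        # Sort so that (a, b) and (b, a) count as the same unordered pair.
        pairs.add(tuple(sorted((tuple(first), tuple(second)))))
    return sorted(pairs)


if __name__ == "__main__":
    # A four-SNP toy example; the paper's regions use eleven SNPs.
    for pair in compatible_pairs((0, 1, 2, 1)):
        print(pair)
```

A subject-vector with k heterozygous positions (k ≥ 1) admits 2^(k-1) such unordered pairs, which is why some additional criterion, such as the probabilistic methods the abstract refers to, is needed to choose among them.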

SUBMITTER: Goodin DS 

PROVIDER: S-EPMC3984082 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Single nucleotide polymorphism (SNP)-strings: an alternative method for assessing genetic associations.

Goodin DS, Khankhanian P

PLoS One, 2014-04-11, issue 4


Similar Datasets

| 2135070 | ecrin-mdr-crc
| S-EPMC3152640 | biostudies-literature
| S-EPMC3916134 | biostudies-literature
| S-EPMC7230726 | biostudies-literature
| S-EPMC4532366 | biostudies-literature
| S-EPMC4983672 | biostudies-literature
| S-EPMC3896655 | biostudies-literature
| S-EPMC2664151 | biostudies-literature
| S-EPMC3488013 | biostudies-literature
| S-EPMC4287849 | biostudies-literature