
Dataset Information


Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.


ABSTRACT: A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.

SUBMITTER: Delaneau O 

PROVIDER: S-EPMC4338501 | biostudies-literature | 2014 Jun

REPOSITORIES: biostudies-literature


Publications

Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

Olivier Delaneau, Jonathan Marchini

Nature Communications, 2014-06-13



Similar Datasets

| S-EPMC4686825 | biostudies-literature
| S-EPMC5522380 | biostudies-literature
| S-EPMC5553676 | biostudies-literature
| S-EPMC5177868 | biostudies-literature
| PRJEB56604 | ENA
| S-EPMC5532257 | biostudies-literature
| S-EPMC4022254 | biostudies-literature
| S-EPMC4579394 | biostudies-literature
| PRJNA28889 | ENA
| S-EPMC5096458 | biostudies-literature