
Dataset Information


Improved imputation accuracy in Hispanic/Latino populations with larger and more diverse reference panels: applications in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).


ABSTRACT: Imputation is commonly used in genome-wide association studies to expand the set of genetic variants available for analysis. Larger and more diverse reference panels, such as the final Phase 3 of the 1000 Genomes Project, hold promise for improving imputation accuracy in genetically diverse populations such as Hispanics/Latinos in the USA. Here, we sought to empirically evaluate imputation accuracy when imputing to a 1000 Genomes Phase 3 versus a Phase 1 reference, using participants from the Hispanic Community Health Study/Study of Latinos. Our assessments included calculating the correlation between imputed and observed allelic dosage in a subset of samples genotyped on a supplemental array. We observed that the Phase 3 reference yielded higher accuracy at rare variants, but that the two reference panels were comparable at common variants. At a sample level, the Phase 3 reference improved imputation accuracy in Hispanic/Latino samples from the Caribbean more than for Mainland samples, which we attribute primarily to the additional reference panel samples available in Phase 3. We conclude that a 1000 Genomes Project Phase 3 reference panel can yield improved imputation accuracy compared with Phase 1, particularly for rare variants and for samples of certain genetic ancestry compositions. Our findings can inform imputation design for other genome-wide association studies of participants with diverse ancestries, especially as larger and more diverse reference panels continue to become available.
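The accuracy assessment described in the abstract, correlating imputed with observed allelic dosage at a variant, can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `dosage_r2`, the use of the squared Pearson correlation, and the toy data are all assumptions made here for clarity.

```python
from typing import Optional, Sequence


def dosage_r2(imputed: Sequence[float], observed: Sequence[float]) -> Optional[float]:
    """Squared Pearson correlation between imputed and observed dosages
    for a single variant (hypothetical helper, not from the paper)."""
    if len(imputed) != len(observed):
        raise ValueError("dosage vectors must have the same length")
    n = len(imputed)
    if n < 2:
        return None  # not enough samples to estimate a correlation

    mean_i = sum(imputed) / n
    mean_o = sum(observed) / n

    # Sums of cross-products and squared deviations (shared 1/n factors cancel).
    cov = sum((a - mean_i) * (b - mean_o) for a, b in zip(imputed, observed))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_o = sum((b - mean_o) ** 2 for b in observed)

    if var_i == 0 or var_o == 0:
        return None  # correlation is undefined when either vector is constant
    return cov * cov / (var_i * var_o)


# Hypothetical toy data: five samples at one variant.
imputed = [0.1, 0.9, 1.8, 0.2, 1.1]  # imputed dosages on the 0-2 scale
observed = [0, 1, 2, 0, 1]           # array genotypes coded as allele counts
print(dosage_r2(imputed, observed))
```

Returning `None` for constant vectors matters in practice for rare variants, where a small validation subset may contain no copies of the minor allele and the correlation is therefore undefined.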

SUBMITTER: Nelson SC 

PROVIDER: S-EPMC5179925 | biostudies-literature | 2016 Aug

REPOSITORIES: biostudies-literature


Publications

Improved imputation accuracy in Hispanic/Latino populations with larger and more diverse reference panels: applications in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).

Sarah C Nelson, Adrienne M Stilp, George J Papanicolaou, Kent D Taylor, Jerome I Rotter, Timothy A Thornton, Cathy C Laurie

Human Molecular Genetics, 2016-06-26, issue 15



Similar Datasets

S-EPMC4889649 | biostudies-literature
S-EPMC4618246 | biostudies-literature
S-EPMC5634934 | biostudies-literature
S-EPMC5752627 | biostudies-literature
S-EPMC5399610 | biostudies-literature
S-EPMC5479416 | biostudies-literature
S-EPMC8417926 | biostudies-literature
S-EPMC6656822 | biostudies-literature
PRJNA263099 | ENA
S-EPMC5756130 | biostudies-literature