Dataset Information

Limitations of the human reference genome for personalized genomics.

ABSTRACT: Data from the 1000 Genomes Project (1KGP) and Complete Genomics (CG) have dramatically increased the number of known genetic variants and challenge several assumptions about the reference genome and its uses in both clinical and research settings. Specifically, 34% of published array-based GWAS studies for a variety of diseases utilize probes that overlap unanticipated single nucleotide polymorphisms (SNPs), indels, or structural variants. Linkage disequilibrium (LD) block length depends on the number of markers used, and the mean LD block size decreases from 16 kb to 7 kb when HapMap-based calculations are compared to blocks computed from 1KGP data. Additionally, when 1KGP and CG variants are compared, 19% of the single nucleotide variants (SNVs) reported from common genomes are unique to one dataset, likely a result of differences in data collection methodology, alignment of reads to the reference genome, and variant-calling algorithms. Together these observations indicate that current research resources and informatics methods do not adequately account for the high level of variation that already exists in the human population, and significant efforts are needed to create resources that can accurately assess personal genomics for health and disease and predict treatment outcomes.
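The dataset comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the 19% figure was computed, so the code assumes one plausible definition: the share of all distinct SNVs (the union of both call sets) that appear in only one set. The function name `fraction_unique`, the variant key `(chromosome, position, ref, alt)`, and the toy call sets are hypothetical and are not taken from the paper or its data.

```python
from typing import Iterable, Tuple

# Assumed variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def fraction_unique(calls_a: Iterable[Variant], calls_b: Iterable[Variant]) -> float:
    """Return the fraction of distinct variants found in only one call set.

    Assumption: the denominator is the union of both call sets. The paper
    may use a different definition.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 0.0
    # The symmetric difference holds variants present in exactly one set.
    return len(a ^ b) / len(union)


# Toy call sets for the same genome (illustrative values only).
kgp_calls = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")]
cg_calls = [("chr1", 100, "A", "G"), ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")]

print(fraction_unique(kgp_calls, cg_calls))
```

Exact matching on position and alleles is the simplest choice. Real call sets would usually need normalization first (for example, consistent chromosome naming and allele representation), which this sketch leaves out.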

SUBMITTER: Rosenfeld JA

PROVIDER: S-EPMC3394790 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

Limitations of the human reference genome for personalized genomics.

Rosenfeld JA, Mason CE, Smith TM

PLoS One 2012-07-11; 7

PMID: 22811759

Similar Datasets

Genome assembly of the JD17 soybean provides a new reference genome for comparative genomics.
| S-EPMC8982393 | biostudies-literature

A chromosome-level reference genome and pangenome for barn swallow population genomics.
| S-EPMC10044405 | biostudies-literature

hg19KIndel: ethnicity normalized human reference genome.
| S-EPMC6555027 | biostudies-literature

Personalized medicine: new genomics, old lessons.
| S-EPMC3128266 | biostudies-literature

An improved pig reference genome sequence to enable pig genetics and genomics research.
| S-EPMC7448572 | biostudies-literature

A draft reference genome of the red abalone, Haliotis rufescens, for conservation genomics.
| S-EPMC9709998 | biostudies-literature

Recovery of non-reference sequences missing from the human reference genome.
| S-EPMC6796347 | biostudies-literature

The non-human primate reference transcriptome resource (NHPRTR) for comparative functional genomics.
| S-EPMC3531109 | biostudies-literature

Personalized genome assembly for accurate cancer somatic mutation discovery using tumor-normal paired reference samples.
| S-EPMC9648002 | biostudies-literature

Preserving biological heterogeneity with personalized genomics batch correction
2016-12-12 | GSE53355 | GEO