Dataset Information

The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

ABSTRACT: BACKGROUND: Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. FINDINGS: As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. CONCLUSIONS: These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

SUBMITTER: Mao Q

PROVIDER: S-EPMC5057367 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature

Publications

The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

Mao Q, Ciotlos S, Zhang RY, Ball MP, Chin R, Carnevali P, Barua N, Nguyen S, Agarwal MR, Clegg T, Connelly A, Vandewege W, Zaranek AW, Estep PW, Church GM, Drmanac R, Peters BA

GigaScience, 2016-10-11, issue 1

PMID: 27724973

Similar Datasets

Project description: BACKGROUND: Multiple laboratories now offer clinical whole genome sequencing (WGS). We anticipate WGS becoming routinely used in research and clinical practice. Many institutions are exploring how best to educate geneticists and other professionals about WGS. Providing students in WGS courses with the option to analyze their own genome sequence is one strategy that might enhance students' engagement and motivation to learn about personal genomics. However, if this option is presented to students, it is vital they make informed decisions, do not feel pressured into analyzing their own genomes by their course directors or peers, and feel free to analyze a third-party genome if they prefer. We therefore developed a 26-hour introductory genomics course in part to help students make informed decisions about whether to receive personal WGS data in a subsequent advanced genomics course. In the advanced course, they had the option to receive their own personal genome data, or an anonymous genome, at no financial cost to them. Our primary aims were to examine whether students made informed decisions regarding analyzing their personal genomes, and whether there was evidence that the introductory course enabled the students to make a more informed decision. METHODS: This was a longitudinal cohort study in which students (N = 19) completed questionnaires assessing their intentions, informed decision-making, attitudes and knowledge before (T1) and after (T2) the introductory course, and before the advanced course (T3). Informed decision-making was assessed using the Decisional Conflict Scale. RESULTS: At the start of the introductory course (T1), most (17/19) students intended to receive their personal WGS data in the subsequent course, but many expressed conflict around this decision. Decisional conflict decreased after the introductory course (T2) indicating there was an increase in informed decision-making, and did not change before the advanced course (T3). This suggests that it was the introductory course content rather than simply time passing that had the effect. In the advanced course, all (19/19) students opted to receive their personal WGS data. No changes in technical knowledge of genomics were observed. Overall attitudes towards WGS were broadly positive. CONCLUSIONS: Providing students with intensive introductory education about WGS may help them make informed decisions about whether or not to work with their personal WGS data in an educational setting.
