Dataset Information

Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges.


ABSTRACT: The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
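One of the assessment measures named in the abstract, ROC AUC, can be illustrated with a minimal sketch. It assumes each submission assigns one probability-like score per individual for a single binary trait; the function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the CAGI evaluation code. The sketch uses the pairwise (Mann-Whitney) formulation: AUC is the fraction of positive/negative pairs in which the individual with the trait receives the higher score, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = individual has the trait)
    scores: iterable of predicted scores, higher = more likely to have the trait
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: five individuals, two of whom have the trait.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1]
print(f"{roc_auc(labels, scores):.3f}")  # prints 0.667
```

An AUC of 0.5 corresponds to random ranking. A predictor that gives every individual the same score, for instance the population frequency of the trait, also lands at 0.5, so AUC alone does not reward knowledge of trait frequency; that may be one reason the challenge combined it with other measures.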

SUBMITTER: Cai B 

PROVIDER: S-EPMC5645203 | biostudies-literature | 2017 Sep

REPOSITORIES: biostudies-literature

Publications

Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges.

Cai Binghuang, Li Biao, Kiga Nikki, Thusberg Janita, Bergquist Timothy, Chen Yun-Ching, Niknafs Noushin, Carter Hannah, Tokheim Collin, Beleva-Guthrie Violeta, Douville Christopher, Bhattacharya Rohit, Yeo Hui Ting Grace, Fan Jean, Sengupta Sohini, Kim Dewey, Cline Melissa, Turner Tychele, Diekhans Mark, Zaucha Jan, Pal Lipika R, Cao Chen, Yu Chen-Hsin, Yin Yizhou, Carraro Marco, Giollo Manuel, Ferrari Carlo, Leonardi Emanuela, Tosatto Silvio C E, Bobe Jason, Ball Madeleine, Hoskins Roger A, Repo Susanna, Church George, Brenner Steven E, Moult John, Gough Julian, Stanke Mario, Karchin Rachel, Mooney Sean D

Human Mutation, 2017-06-19, issue 9


Similar Datasets

| S-EPMC3978420 | biostudies-literature
| S-EPMC3935288 | biostudies-literature
| S-EPMC8134793 | biostudies-literature
| S-EPMC7063715 | biostudies-literature
| S-EPMC6424514 | biostudies-literature
| S-EPMC6444358 | biostudies-literature
| S-EPMC7385432 | biostudies-literature
| S-EPMC7462465 | biostudies-literature
| S-EPMC6460651 | biostudies-literature
| S-EPMC6712681 | biostudies-literature