Dataset Information

A combined functional annotation score for non-synonymous variants.


ABSTRACT: Next-generation sequencing has opened the possibility of large-scale sequence-based disease association studies. A major challenge in interpreting whole-exome data is predicting which of the discovered variants are deleterious or neutral. To address this question in silico, we have developed a score called Combined Annotation scoRing toOL (CAROL), which combines information from 2 bioinformatics tools: PolyPhen-2 and SIFT, in order to improve the prediction of the effect of non-synonymous coding variants. We used a weighted Z method that combines the probabilistic scores of PolyPhen-2 and SIFT. We defined 2 dataset pairs to train and test CAROL using information from the dbSNP: 'HGMD-PUBLIC' and 1000 Genomes Project databases. The training pair comprises a total of 980 positive control (disease-causing) and 4,845 negative control (non-disease-causing) variants. The test pair consists of 1,959 positive and 9,691 negative controls. CAROL has higher predictive power and accuracy for the effect of non-synonymous variants than each individual annotation tool (PolyPhen-2 and SIFT) and benefits from higher coverage. The combination of annotation tools can help improve automated prediction of whole-genome/exome non-synonymous variant functional consequences.
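The abstract names a weighted Z method for combining the two tools' probabilistic scores but does not give the formula. As a rough illustration only, the sketch below implements the generic weighted Z (Stouffer) combination, Z = sum(w_i * Z_i) / sqrt(sum(w_i^2)) with Z_i = Phi^-1(1 - p_i). It is not the CAROL implementation: the function name `weighted_z`, the weights, the clamping constant `eps`, and the idea of feeding each tool's score in as a p-value-like probability are all assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def weighted_z(p_values, weights, eps=1e-12):
    """Combine p-value-like probabilities with the generic weighted Z method.

    Hypothetical helper for illustration; not the published CAROL formula.
    Returns (combined Z statistic, combined one-sided p-value).
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("p_values and weights must be non-empty and equal in length")

    z_scores = []
    for p in p_values:
        # Clamp away from 0 and 1 so the inverse CDF stays finite.
        p = min(max(p, eps), 1.0 - eps)
        z_scores.append(_STD_NORMAL.inv_cdf(1.0 - p))

    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator
    combined_p = 1.0 - _STD_NORMAL.cdf(combined_z)
    return combined_z, combined_p


if __name__ == "__main__":
    # Two made-up probabilities standing in for per-tool evidence (assumed inputs).
    z, p = weighted_z([0.05, 0.05], [1.0, 1.0])
    print(f"combined Z = {z:.3f}, combined p = {p:.4f}")
```

With equal weights, two moderately small inputs reinforce each other and yield a smaller combined probability than either alone, which is the general intuition behind combining annotation tools. How CAROL maps PolyPhen-2 and SIFT scores onto this scale, and which weights it uses, is described in the publication rather than in this record.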

SUBMITTER: Lopes MC 

PROVIDER: S-EPMC3390741 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

A combined functional annotation score for non-synonymous variants.

Lopes MC, Joyce C, Ritchie GR, John SL, Cunningham F, Asimit J, Zeggini E

Human Heredity, 2012-01-18, issue 1


Similar Datasets

| S-EPMC5388428 | biostudies-literature
| S-EPMC4086131 | biostudies-literature
| S-EPMC9375913 | biostudies-literature
| S-EPMC4013067 | biostudies-literature
| S-EPMC10081529 | biostudies-literature
| S-EPMC5015703 | biostudies-literature
| S-EPMC6791167 | biostudies-literature
| S-EPMC3387200 | biostudies-literature
| S-EPMC6978412 | biostudies-literature
| S-EPMC2981684 | biostudies-literature