
Dataset Information


Better prediction of functional effects for sequence variants.


ABSTRACT: Elucidating the effects of naturally occurring genetic variation is one of the major challenges for personalized health and personalized medicine. Here, we introduce SNAP2, a novel neural-network-based classifier that improves over the state of the art in distinguishing between effect and neutral variants. Our method's improved performance results from screening many potentially relevant protein features and from refining our development data sets. Cross-validated on >100k experimentally annotated variants, SNAP2 significantly outperformed other methods, attaining a two-state accuracy (effect/neutral) of 83%. SNAP2 also outperformed combinations of other methods. Performance increased for human variants, but much more so for other organisms. Our method's carefully calibrated reliability index informs the selection of variants for experimental follow-up, with the most strongly predicted half of all effect variants predicted at over 96% accuracy. As expected, the evolutionary information from automatically generated multiple sequence alignments gave the strongest signal for the prediction. However, we also optimized our new method to perform surprisingly well even without alignments. This feature reduces prediction runtime by over two orders of magnitude, enables cross-genome comparisons, and makes our new method the best solution for the 10-20% of sequence orphans. SNAP2 is available at: https://rostlab.org/services/snap2web.

SUBMITTER: Hecht M 

PROVIDER: S-EPMC4480835 | biostudies-other | 2015

REPOSITORIES: biostudies-other


Publications

Better prediction of functional effects for sequence variants.

Maximilian Hecht, Yana Bromberg, Burkhard Rost

BMC Genomics, 2015-06-18



Similar Datasets

| S-EPMC9890318 | biostudies-literature
| S-EPMC6078163 | biostudies-literature
| S-EPMC5015703 | biostudies-literature
| S-EPMC2654563 | biostudies-literature
| S-EPMC5470696 | biostudies-literature
| S-EPMC6881321 | biostudies-literature
| S-EPMC8929166 | biostudies-literature
| S-EPMC7919641 | biostudies-literature
| S-EPMC1277819 | biostudies-literature
| S-EPMC2951711 | biostudies-literature