Dataset Information

A multi-task convolutional deep neural network for variant calling in single molecule sequencing.


ABSTRACT: The accurate identification of DNA sequence variants is an important but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5-15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67%, 95.78%, and 90.53% F1-score on 1KP common variants, and 98.65%, 92.57%, and 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source (https://github.com/aquaskyline/Clairvoyante), with modules to train, utilize and visualize the model.
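
NOTE: The F1-scores quoted in the abstract combine precision and recall into a single figure. The sketch below shows the standard definition only; it is not taken from the Clairvoyante code base. The function name `f1_score` and the call counts in the usage line are hypothetical, chosen purely for illustration.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall for a set of variant calls."""
    if true_positives == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the paper.
example = f1_score(true_positives=90, false_positives=10, false_negatives=10)
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both high, which is why it is a common summary for variant callers that must avoid both false calls and missed variants.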

SUBMITTER: Luo R 

PROVIDER: S-EPMC6397153 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

A multi-task convolutional deep neural network for variant calling in single molecule sequencing.

Luo R, Sedlazeck FJ, Lam TW, Schatz MC

Nature Communications, 2019 Mar 1, issue 1


Similar Datasets

| S-EPMC6909530 | biostudies-literature
| S-EPMC5549771 | biostudies-literature
2021-01-11 | GSE147113 | GEO
| S-EPMC6110828 | biostudies-other
| S-EPMC8389446 | biostudies-literature
| S-EPMC8654959 | biostudies-literature
| S-EPMC6722845 | biostudies-literature
| S-EPMC6025884 | biostudies-literature
| S-EPMC10333175 | biostudies-literature
| S-EPMC8061452 | biostudies-literature