Dataset Information

NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks.


ABSTRACT: Long-read sequencing enables variant detection in genomic regions that are considered difficult to map by short-read sequencing. To fully exploit the benefits of longer reads, we present NanoCaller, a deep learning method that detects SNPs using long-range haplotype information, then phases long reads with the called SNPs and calls indels with local realignment. Evaluation on 8 human genomes demonstrates that NanoCaller generally achieves better performance than competing approaches. We experimentally validate 41 novel variants in a widely used benchmarking genome that could not be reliably detected previously. In summary, NanoCaller facilitates the discovery of novel variants in complex genomic regions from long-read sequencing.

SUBMITTER: Ahsan MU 

PROVIDER: S-EPMC8419925 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC3926339 | biostudies-literature
| S-EPMC10612404 | biostudies-literature
| S-EPMC10274712 | biostudies-literature
| S-EPMC8092372 | biostudies-literature
| S-EPMC2689609 | biostudies-literature
| S-EPMC7223266 | biostudies-literature
| S-EPMC3888126 | biostudies-literature
| S-EPMC10783491 | biostudies-literature
| S-EPMC7245042 | biostudies-literature
| S-EPMC5870570 | biostudies-literature