ABSTRACT: Summary
RNA sequencing (RNA-seq) can be applied to diverse tasks including quantifying gene expression, discovering quantitative trait loci and identifying gene fusion events. Although RNA-seq can detect germline variants, the complexities of variable transcript abundance, target capture and amplification introduce challenging sources of error. Here, we extend DeepVariant, a deep-learning-based variant caller, to learn and account for the unique challenges presented by RNA-seq data. Our DeepVariant RNA-seq model produces highly accurate variant calls from RNA-sequencing data, and outperforms existing approaches such as Platypus and GATK. We examine factors that influence accuracy, how our model addresses RNA editing events and how additional thresholding can be used to facilitate our models' use in a production pipeline.
Supplementary information
Supplementary data are available at Bioinformatics Advances online.
SUBMITTER: Cook DE
PROVIDER: S-EPMC10320079 | biostudies-literature | 2023
REPOSITORIES: biostudies-literature
AUTHORS: Daniel E Cook, Aarti Venkat, Dennis Yelizarov, Yannick Pouliot, Pi-Chuan Chang, Andrew Carroll, Francisco M De La Vega
Bioinformatics Advances, 2023-06-13, 1