Project description:Although state-of-the-art algorithms for m6A detection and quantification via nanopore direct RNA sequencing have been continuously developed, little is known about their capacities and limitations, so a comprehensive assessment is urgently needed. Therefore, we performed comprehensive benchmarking of 10 computational tools that rely on current-based and base-calling “error” strategies for m6A detection by nanopore sequencing.
Project description:To detect the modified bases in SINEUP RNA, we compared chemically modified in vitro transcribed (IVT) SINEUP-GFP RNA with in-cell transcribed (ICT) SINEUP RNA from HEK293T/17 cells co-transfected with SINEUP-GFP and sense EGFP. A comparative study of Nanopore direct RNA sequencing data from non-modified and modified IVT samples against data from the ICT SINEUP RNA sample revealed the positions of modified k-mers in SINEUP RNA in the cell.
Project description:5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) are modified versions of cytosine in DNA with roles in regulating gene expression. Using whole genomic DNA from mouse cerebellum, we have benchmarked 5mC and 5hmC detection by Oxford Nanopore Technologies sequencing against other standard techniques. In addition, we assessed the ability of duplex base-calling to study strand asymmetric modification. Nanopore detection of 5mC and 5hmC is accurate relative to compared techniques and opens new means of studying these modifications. Strand asymmetric modification is widespread across the genome but reduced at imprinting control regions and CTCF binding sites in mouse cerebellum. This study demonstrates the unique ability of nanopore sequencing to improve the resolution and detail of cytosine modification mapping.
Project description:Chemical RNA modifications, collectively referred to as the ‘epitranscriptome’, have been intensively studied in recent years, largely facilitated by the use of next-generation sequencing technologies. Recent efforts have turned towards the nanopore direct RNA sequencing (DRS) platform, as it allows simultaneous detection of diverse RNA modification types in full-length native RNA molecules. While RNA modifications can be identified in the form of systematic basecalling ‘errors’ in DRS datasets, m6A modifications produce very modest ‘errors’, limiting the applicability of this approach to sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified in vitro synthetic sequences, increases the ‘error’ signal of m6A modifications, leading to enhanced detection of RNA modifications even at lower stoichiometries. We then show that the use of these models enhances the detection of RNA modifications in previously published in vivo human samples, using third-party software for the detection of RNA modifications. Moreover, our work provides a novel RNA basecalling model that shows a median accuracy of 97%, compared to 91% for previously available RNA basecalling models. Notably, this increase in accuracy not only improves the detection of RNA modifications but also enhances the mappability of RNA reads, which becomes more evident in the case of short RNA reads (50% increase). Altogether, our work stresses the importance of using fully unmodified RNA sequences for training RNA basecalling models, and shows how the use of different basecalling models can significantly affect the detection of RNA modifications and read mappability.
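The basecalling ‘error’ strategy described above can be illustrated with a minimal sketch: compare the per-position error rate of a native sample against an unmodified control and keep positions where the difference is large. This is not the method of any tool named in these descriptions. The input layout (position mapped to counts of matching and erroneous reads), the function names, and the `min_cov` and `min_delta` thresholds are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: reference position -> (reads matching the
# reference, reads with a mismatch/insertion/deletion at that position).
Counts = Dict[int, Tuple[int, int]]


def error_rate(match: int, error: int) -> float:
    """Fraction of reads carrying a basecalling 'error' at one position."""
    total = match + error
    return error / total if total else 0.0


def candidate_sites(
    sample: Counts,
    control: Counts,
    min_cov: int = 20,       # assumed coverage cutoff, not from the source
    min_delta: float = 0.1,  # assumed error-rate difference cutoff
) -> List[Tuple[int, float]]:
    """Return (position, error-rate difference) for positions where the
    sample shows a higher error rate than the unmodified control."""
    hits = []
    # Only positions covered in both datasets can be compared.
    for pos in sorted(sample.keys() & control.keys()):
        s_match, s_err = sample[pos]
        c_match, c_err = control[pos]
        # Skip positions with too few reads in either dataset.
        if s_match + s_err < min_cov or c_match + c_err < min_cov:
            continue
        delta = error_rate(s_match, s_err) - error_rate(c_match, c_err)
        if delta >= min_delta:
            hits.append((pos, round(delta, 3)))
    return hits


if __name__ == "__main__":
    # Toy counts, invented purely to exercise the function.
    native = {10: (60, 40), 11: (95, 5), 12: (5, 5)}
    ivt = {10: (95, 5), 11: (96, 4), 12: (9, 1)}
    print(candidate_sites(native, ivt))
```

In this toy input, position 11 falls below the difference cutoff and position 12 is skipped for low coverage. A difference threshold like this is only sensitive to sites modified at high stoichiometry, which is the limitation the description above attributes to m6A.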