Project description:Chemical RNA modifications, collectively referred to as the ‘epitranscriptome’, have been intensively studied in recent years, largely facilitated by next-generation sequencing technologies. Recent efforts have turned towards the nanopore direct RNA sequencing (DRS) platform, as it allows simultaneous detection of diverse RNA modification types in full-length native RNA molecules. While RNA modifications can be identified as systematic basecalling ‘errors’ in DRS datasets, m6A modifications produce very modest ‘errors’, limiting the applicability of this approach to sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified in vitro synthetic sequences, increases the ‘error’ signal of m6A modifications, leading to enhanced detection of RNA modifications even at lower stoichiometries. We then show that these models enhance the detection of RNA modifications in previously published in vivo human samples, using third-party software for the detection of RNA modifications. Moreover, our work provides a novel RNA basecalling model with a median accuracy of 97%, compared to 91% for previously available RNA basecalling models. Notably, this increase in accuracy leads not only to improved detection of RNA modifications but also to enhanced mappability of RNA reads, which is most evident for short RNA reads (50% increase). Altogether, our work stresses the importance of using fully unmodified RNA sequences for training RNA basecalling models, and shows that the choice of basecalling model can significantly affect the detection of RNA modifications and read mappability.
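The basecalling 'error' idea described above can be illustrated with a minimal sketch. It assumes the bases observed in aligned reads at one reference position have already been collected, and it counts substitutions only, ignoring insertions and deletions. The function name `mismatch_frequency` and its inputs are hypothetical and are not taken from the project or from any specific tool.

```python
from collections import Counter


def mismatch_frequency(ref_base, observed_bases):
    """Return the fraction of observed bases that differ from the reference base.

    Assumption: `observed_bases` holds one base call per aligned read at a
    single reference position; indels are not considered in this sketch.
    """
    counts = Counter(base.upper() for base in observed_bases)
    total = sum(counts.values())
    if total == 0:
        # No coverage at this position: report zero rather than divide by zero.
        return 0.0
    return 1.0 - counts[ref_base.upper()] / total


# Three of four reads agree with the reference 'A' at this position.
print(mismatch_frequency("A", ["A", "A", "G", "A"]))
```

A site with a modest mismatch frequency in such a calculation would be hard to separate from background basecalling noise, which is the limitation the description attributes to m6A at low stoichiometries.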
Project description:Although state-of-the-art algorithms for m6A detection and quantification via nanopore direct RNA sequencing have been continuously developed, little is known about their capabilities and limitations, so a comprehensive assessment is urgently needed. We therefore performed comprehensive benchmarking of 10 computational tools that rely on current-based and basecalling “error” strategies for m6A detection by nanopore sequencing.
Project description:N6-methyladenosine (m6A) and pseudouridine (Ψ) are the two most abundant modifications in mammalian mRNA, but the coordination of their biological functions remains poorly understood. We develop a machine learning-based nanopore direct RNA sequencing method (NanoSPA) that simultaneously analyzes m6A and Ψ in the human transcriptome. Applying NanoSPA to polysome profiling, we reveal opposing transcriptomic co-occurrence of m6A and Ψ and synergistic, hierarchical effects of m6A and Ψ on the polysome.
Project description:We report the direct RNA sequencing of HEK293 and a primary human mammary epithelial cell (HMEC) line using Oxford Nanopore-based sequencing. Using these data, we built an algorithm to detect m6A modifications within the DRACH motif context. Evaluation of m6A sites was carried out with HEK METTL3 knockdown and HMEC ALKBH5 overexpression cell lines.
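The DRACH motif context mentioned above can be made concrete with a short sketch. It locates candidate adenosines in a sequence using the standard IUPAC meaning of DRACH (D = A/G/U, R = A/G, H = A/C/U). This is only a motif scan, not the detection algorithm built in the project, and the function name `find_drach_sites` is hypothetical.

```python
import re

# DRACH in IUPAC terms: D = A/G/U, R = A/G, then A, C, and H = A/C/U.
# The lookahead lets overlapping motif occurrences all be reported.
DRACH_PATTERN = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach_sites(sequence):
    """Return (position_of_central_A, motif) pairs, using 0-based positions."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-alphabet input too
    return [(m.start() + 2, m.group(1)) for m in DRACH_PATTERN.finditer(rna)]


# The central A of GGACU sits at 0-based position 4 in this toy sequence.
print(find_drach_sites("CCGGACUCC"))
```

Restricting candidate sites to this motif narrows the search space before any signal-level or error-level evidence is examined.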
Project description:In this study, Nanopore direct RNA-seq, in which native RNAs are sequenced directly as near full-length transcripts in the 3' to 5' direction, was used to validate transcription units of the phytopathogen Dickeya dadantii 3937 and to determine transcriptional termination sites. Briefly, D. dadantii cultures were grown in M63 medium supplemented with 0.2% glucose and 0.2% PGA until the early exponential phase (A600nm = 0.2, condition 1) or the early stationary phase (A600nm = 1.8, condition 2). RNAs were extracted using a frozen acid-phenol method, as previously described (Hommais et al. 2008), and treated successively with Roche and Biolabs DNases. Two samples were prepared: 50 µg of RNAs from each condition were pooled into one sample (sample 1), whereas the other contained 100 µg of RNAs from condition 2 (sample 2). Both samples were then supplied to Vertis Biotechnologie AG for Nanopore native RNA-seq. Total RNA preparations were first examined by capillary electrophoresis. For sample 1, ribosomal RNA molecules were depleted using an in-house developed protocol (recovery rate = 84%), whereas no ribodepletion was performed for sample 2. The 3' ends of the RNA were then poly(A)-tailed using poly(A) polymerase, and the Direct RNA sequencing kit (SQK-RNA002) was used to prepare the library for 1D sequencing on the Oxford Nanopore sequencing device. The direct RNA libraries were sequenced on a MinION device (MIN-101B) using standard settings. Basecalling of the fast5 files was performed using Guppy (version 3.6.1) with the following settings: --flowcell FLO-MIN106 --kit SQK-RNA002 --cpu_threads_per_caller 12 --compress_fastq --reverse_sequence true --trim_strategy rna. Reads shorter than 50 nt were removed. 466 393 and 556 850 reads were generated for samples 1 and 2, respectively.
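The read-length filter described above (removal of reads shorter than 50 nt) could be sketched as follows. It assumes the basecalled reads are available as plain four-line FASTQ records. The description does not say which tool was actually used for this step, so the function names and the constant below are hypothetical.

```python
MIN_READ_LENGTH = 50  # reads shorter than this many nucleotides are dropped


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from four-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping one iterator with itself four times groups lines into records;
    # an incomplete trailing record is silently ignored.
    for header, seq, _plus, qual in zip(*[stripped] * 4):
        yield header, seq, qual


def filter_short_reads(records, min_length=MIN_READ_LENGTH):
    """Keep only records whose sequence is at least `min_length` nt long."""
    for header, seq, qual in records:
        if len(seq) >= min_length:
            yield header, seq, qual


# Toy input: one 49-nt read (removed) and one 50-nt read (kept).
example = [
    "@read1", "A" * 49, "+", "I" * 49,
    "@read2", "A" * 50, "+", "I" * 50,
]
for header, seq, qual in filter_short_reads(parse_fastq(example)):
    print(header, len(seq))
```

For compressed output, as produced with --compress_fastq, the same functions could be fed lines from a file opened in text mode with Python's `gzip.open`.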