Project description:Chemical RNA modifications, collectively referred to as the ‘epitranscriptome’, have been intensively studied in recent years, largely facilitated by next-generation sequencing technologies. Recent efforts have turned towards the nanopore direct RNA sequencing (DRS) platform, as it allows simultaneous detection of diverse RNA modification types in full-length native RNA molecules. While RNA modifications can be identified in the form of systematic basecalling ‘errors’ in DRS datasets, m6A modifications produce very modest ‘errors’, limiting the applicability of this approach to sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified in vitro synthetic sequences, increases the ‘error’ signal of m6A modifications, leading to enhanced detection of RNA modifications even at lower stoichiometries. We then show that these models enhance the detection of RNA modifications in previously published in vivo human samples when used with third-party software for the detection of RNA modifications. Moreover, our work provides a novel RNA basecalling model that shows a median accuracy of 97%, compared with 91% for previously available RNA basecalling models. Notably, this increase in accuracy not only improves the detection of RNA modifications but also enhances the mappability of RNA reads, which is most evident for short RNA reads (50% increase). Altogether, our work stresses the importance of using fully unmodified RNA sequences for training RNA basecalling models, and shows that the choice of basecalling model can significantly affect the detection of RNA modifications and read mappability.
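The idea of reading modifications from systematic basecalling ‘errors’ can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the pre-aligned read strings, and the `min_delta` threshold are all assumptions made for illustration. It compares the per-position mismatch frequency of a sample against an unmodified control and reports positions where the sample shows a clearly higher ‘error’ rate.

```python
def mismatch_frequency(reference, aligned_reads):
    """Per-position fraction of read bases that differ from the reference.

    `aligned_reads` are hypothetical, already-aligned strings of the same
    length as `reference`; '-' marks a position the read does not cover.
    """
    freqs = []
    for i, ref_base in enumerate(reference):
        # Collect the bases observed at this position, skipping uncovered reads.
        column = [read[i] for read in aligned_reads if read[i] != "-"]
        if not column:
            freqs.append(0.0)
            continue
        mismatches = sum(1 for base in column if base != ref_base)
        freqs.append(mismatches / len(column))
    return freqs


def candidate_sites(reference, sample_reads, control_reads, min_delta=0.2):
    """Positions where the sample mismatch rate exceeds the control's.

    `min_delta` is an arbitrary illustrative threshold, not a published value.
    """
    sample = mismatch_frequency(reference, sample_reads)
    control = mismatch_frequency(reference, control_reads)
    return [i for i, (s, c) in enumerate(zip(sample, control)) if s - c >= min_delta]


reference = "GGACU"
sample_reads = ["GGACU", "GGGCU", "GGUCU", "GG-CU"]   # toy data
control_reads = ["GGACU", "GGACU", "GGACU", "GGACU"]  # toy unmodified control
print(candidate_sites(reference, sample_reads, control_reads))
```

In this toy input only the central A (index 2) differs between sample and control, so it is the only position reported. A lower-stoichiometry site would produce a smaller difference, which is why a stronger ‘error’ signal from the basecalling model matters.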
Project description:Although state-of-the-art algorithms for m6A detection and quantification via nanopore direct RNA sequencing have been continuously developed, little is known about their capabilities and limitations, so a comprehensive assessment is urgently needed. We therefore performed comprehensive benchmarking of 10 computational tools that rely on current-based or basecalling “error” strategies for m6A detection by nanopore sequencing.
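The description does not state which metrics the benchmark used. As a hedged illustration only, the sketch below scores one tool's predicted sites against a ground-truth set using precision, recall and F1, which are common choices for this kind of comparison. The function name, the `(transcript, position)` site encoding, and the toy data are assumptions.

```python
def precision_recall_f1(predicted, truth):
    """Score predicted sites against ground-truth sites.

    Sites are hypothetical (transcript_id, position) tuples.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: 4 known sites, 3 predictions, 2 of them correct.
truth = {("tx1", 10), ("tx1", 55), ("tx2", 7), ("tx3", 120)}
predicted = {("tx1", 10), ("tx2", 7), ("tx2", 99)}
precision, recall, f1 = precision_recall_f1(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function over the output of each tool, on the same ground-truth set, gives directly comparable numbers.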
Project description:N6-methyladenosine (m6A) and pseudouridine (Ψ) are the two most abundant modifications in mammalian mRNA, but the coordination of their biological functions remains poorly understood. We develop a machine learning-based nanopore direct RNA sequencing method (NanoSPA) that simultaneously analyzes m6A and Ψ in the human transcriptome. Applying NanoSPA to polysome profiling, we reveal opposing transcriptomic co-occurrence of m6A and Ψ and synergistic, hierarchical effects of m6A and Ψ on the polysome.
Project description:We report the direct RNA sequencing of HEK293 and a primary human mammary epithelial cell (HMEC) line using Oxford Nanopore-based sequencing. Using these data, we built an algorithm to detect m6A modifications within the DRACH motif context. Evaluation of m6A sites was carried out with HEK METTL3 knockdown and HMEC ALKBH5 overexpression cell lines.
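For readers unfamiliar with the DRACH motif, the sketch below shows how candidate positions can be enumerated before any signal-level classification. It covers only the motif scan, not the detection algorithm described above, and the function name is hypothetical. In IUPAC notation D is A, G or U, R is A or G, and H is A, C or U; the central A is the candidate m6A position.

```python
import re

# DRACH in IUPAC terms: D = A/G/U, R = A/G, then A, C, and H = A/C/U.
# The lookahead keeps overlapping motifs; T is accepted alongside U so that
# cDNA-style reference sequences also work.
DRACH = re.compile(r"(?=([AGUT][AG]AC[ACUT]))")


def find_drach_sites(sequence):
    """Return (0-based position of the central A, motif) for each DRACH hit."""
    seq = sequence.upper()
    return [(m.start() + 2, m.group(1)) for m in DRACH.finditer(seq)]


print(find_drach_sites("CCGGACUAAACA"))
```

On this toy sequence the scan finds two motifs, `GGACU` and `AAACA`, and reports the index of the central A in each.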
Project description:Long-read nanopore sequencing has emerged as a potent tool for studying RNA modifications. However, the detection of N4-acetylcytidine (ac4C) based on nanopore sequencing remains largely unexplored. Here, we introduce ac4Cnet, a deep learning framework that uses Oxford Nanopore direct RNA sequencing to accurately identify ac4C sites. Our methodology involves training ac4Cnet to distinguish ac4C from unmodified cytidine and 5-methylcytosine (m5C), and to estimate the modification rate at each ac4C site. We demonstrate the robustness of our approach through validations on independent in vitro datasets and a human cell line, highlighting its versatility and potential for advancing the study of ac4C modifications.
Project description:N6-methyladenosine (m6A) has been recognized as one of the most abundant and well-known modifications in mRNA since its discovery in the 1970s. Recent studies have demonstrated that m6A is involved in various biological processes, such as alternative splicing and RNA degradation, and plays an important role in a variety of diseases. To better understand the role of m6A, transcriptome-wide m6A profiling data are indispensable. In recent years, the Oxford Nanopore Technologies Direct RNA Sequencing (DRS) platform has shown promise in RNA modification detection based on current disruptions measured in transcripts. However, decoding current intensity data into modification profiles remains a challenging task. Here, we introduce m6A Transcriptome-wide Mapper (m6ATM), a novel Python-based computational pipeline that applies deep neural networks to predict m6A sites at single-base resolution using DRS data. The m6ATM model architecture incorporates a WaveNet encoder and a dual-stream multiple instance learning model to extract features from specific target sites and characterize the m6A epitranscriptome. For validation, m6ATM achieved an accuracy of 80 to 98% across in vitro transcription datasets containing varying m6A modification ratios and outperformed other tools in benchmarking with human cell-line data. Moreover, we demonstrated the versatility of m6ATM in providing reliable stoichiometric information and used it to pinpoint PEG10 as a potential m6A target transcript in liver cancer cells. In conclusion, we show that m6ATM is a high-performance m6A detection tool, and our results pave the way for epitranscriptomic precision medicine.
Project description:RNA internal modifications play a critical role in the development of multicellular organisms and their response to environmental cues. Using nanopore direct RNA sequencing (DRS), we constructed a large in vitro epitranscriptome (IVET) resource from a plant cDNA library labeled with m6A, m1A and m5C, respectively. Furthermore, after transfer learning, the pre-trained model was used to detect additional RNA internal modifications such as m1A, hm5C, m7G and Ψ. Finally, we illustrated a global view of the epitranscriptome, covering m6A, m1A, m5C, m7G and Ψ modifications, in rice seedlings under normal and high-salinity conditions. In summary, we provide a strategy for creating an IVET resource from a cDNA library and developed a computational method, termed TandemMod, that uses IVET-based transfer learning to profile the epitranscriptome landscape, including the co-occupancy of multiple types of RNA modification, in plants responding to environmental signals.