Project description:N6-methyladenosine (m6A) is one of the most abundant and best-known modifications in mRNA, first discovered in the 1970s. Recent studies have demonstrated that m6A is involved in various biological processes such as alternative splicing and RNA degradation, and plays an important role in many diseases. To better understand the role of m6A, transcriptome-wide m6A profiling data is indispensable. In recent years, the Oxford Nanopore Technologies Direct RNA Sequencing (DRS) platform has shown promise in RNA modification detection based on current disruptions measured in transcripts. However, decoding current intensity data into modification profiles remains a challenging task. Here, we introduce m6A Transcriptome-wide Mapper (m6ATM), a novel Python-based computational pipeline that applies deep neural networks to predict m6A sites at single-base resolution using DRS data. The m6ATM model architecture incorporates a WaveNet encoder and a dual-stream multiple instance learning model to extract features from specific target sites and characterize the m6A epitranscriptome. For validation, m6ATM achieved an accuracy of 80 to 98% across in vitro transcription datasets containing varying m6A modification ratios and outperformed other tools in benchmarking with human cell-line data. Moreover, we demonstrated the versatility of m6ATM in providing reliable stoichiometric information and used it to pinpoint PEG10 as a potential m6A target transcript in liver cancer cells. In conclusion, we showed that m6ATM is a high-performance m6A detection tool, and our results pave the way for epitranscriptomic precision medicine.
Project description:RNA internal modifications play a critical role in the development of multicellular organisms and their response to environmental cues. Using nanopore direct RNA sequencing (DRS), we constructed a large in vitro epitranscriptome (IVET) resource from a plant cDNA library labeled with m6A, m1A and m5C, respectively. Furthermore, after transfer learning, the pre-trained model was used to detect additional RNA internal modifications such as m1A, hm5C, m7G and Ψ. Finally, we illustrated a global view of the epitranscriptome, covering m6A, m1A, m5C, m7G and Ψ modifications, in rice seedlings under normal and high-salinity environments. In summary, we provided a strategy for creating an IVET resource from a cDNA library and developed TandemMod, a computational method that uses IVET-based transfer learning to profile the epitranscriptome landscape, including the co-occupancy of multiple types of RNA modifications, in plants responding to environmental signals.
Project description:We developed Solo, a semi-supervised deep learning framework for identifying doublets in scRNA-seq analysis. To validate our method, we used MULTI-seq with cholesterol-modified oligos (CMOs) to experimentally identify doublets in mouse kidney, a solid tissue with diverse cell types, and showed that Solo recapitulated the experimentally identified doublets.
Project description:Here we present miR-eCLIP analysis of AGO2 in HEK293 cells to characterize the small RNA repertoire and uncover its physiological targets. We developed an optimized bioinformatics approach for chimeric read identification to detect high-confidence chimeras, which were used as biologically validated input for miRBind, a deep learning method and web server that can accurately predict the potential of miRNA:target site binding.
Project description:To reliably generate theoretical libraries for use in SWATH experiments, we developed a prediction framework, deep-learning for SWATH analysis (dpSWATH), to improve the sensitivity and specificity of data generated by Q-TOF mass spectrometers. The theoretical library built by dpSWATH allowed us to increase the identification rate of proteins and peptides compared to traditional or library-free methods. In particular, the in silico library built at transcriptome scale identified the most proteins while maintaining an FDR similar to that of the DDA library. Based on our analysis, we conclude that dpSWATH is superior to other algorithms based on Orbitrap data in predicting libraries for SWATH-MS measurements.
Project description:This dataset contains Xdrop followed by Oxford Nanopore long-read sequencing (Xdrop-LRS) performed on target tRNA gene deletion (t8) and intergenic region deletion (i50) clones in HepG2 cells. By applying a de novo assembly-based approach to the Xdrop-LRS data, we identified Cas9-induced on-target genomic alterations.