Project description:N6-methyladenosine (m6A) is the most abundant internal RNA modification in eukaryotic mRNAs and influences many aspects of RNA processing, such as RNA stability and translation. miCLIP (m6A individual-nucleotide resolution UV crosslinking and immunoprecipitation) is an antibody-based approach to map m6A sites in the transcriptome with single-nucleotide resolution. However, due to broad antibody reactivity, reliable identification of m6A sites from miCLIP data remains challenging. Here, we present several experimental and computational innovations that significantly improve transcriptome-wide detection of m6A sites. Based on the recently developed iCLIP2 protocol, the optimised miCLIP2 results in high-complexity libraries using less input material, leading to a more comprehensive representation of m6A sites. Next, we established a robust computational pipeline to identify true m6A sites from our miCLIP2 data. The analyses are calibrated with data from Mettl3 knockout cells to learn the characteristics of m6A deposition, including a significant number of m6A sites outside of DRACH motifs. In order to make these results universally applicable, we trained a machine learning model, m6Aboost, based on the experimental and RNA sequence features. Importantly, m6Aboost allows prediction of genuine m6A sites in miCLIP data without filtering for DRACH motifs or the need for Mettl3 depletion. Using m6Aboost, we identify thousands of high-confidence m6A sites in different murine and human cell lines, which provide a rich resource for future analysis. Collectively, our combined experimental and computational methodology greatly improves m6A identification.
Project description:N6-methyladenosine (m6A) is the most abundant internal RNA modification in eukaryotic mRNAs and influences many aspects of RNA processing. miCLIP (m6A individual-nucleotide resolution UV crosslinking and immunoprecipitation) is an antibody-based approach to map m6A sites with single-nucleotide resolution. However, due to broad antibody reactivity, reliable identification of m6A sites from miCLIP data remains challenging. Here, we present miCLIP2 in combination with machine learning to significantly improve m6A detection. The optimized miCLIP2 results in high-complexity libraries from less input material. Importantly, we established a robust computational pipeline to tackle the inherent issue of false positives in antibody-based m6A detection. The analyses were calibrated with Mettl3 knockout cells to learn the characteristics of m6A deposition, including m6A sites outside of DRACH motifs. To make our results universally applicable, we trained a machine learning model, m6Aboost, based on the experimental and RNA sequence features. Importantly, m6Aboost allows prediction of genuine m6A sites in miCLIP2 data without filtering for DRACH motifs or the need for Mettl3 depletion. Using m6Aboost, we identify thousands of high-confidence m6A sites in different murine and human cell lines, which provide a rich resource for future analysis. Collectively, our combined experimental and computational methodology greatly improves m6A identification.
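The DRACH motif mentioned above is the consensus sequence context of m6A, written in IUPAC codes (D = A/G/U, R = A/G, H = A/C/U), with the methylated adenosine at the central position. As a minimal sketch of the kind of motif filter that m6Aboost is said to make unnecessary, the following checks whether a candidate adenosine sits in a DRACH 5-mer. The function name `is_drach_site` and the 0-based indexing are assumptions made for illustration; this is not part of the m6Aboost pipeline.

```python
import re

# DRACH in IUPAC codes, spelled out on the RNA alphabet:
# D = A/G/U, R = A/G, A, C, H = A/C/U
DRACH = re.compile(r"[AGU][AG]AC[ACU]")


def is_drach_site(seq: str, pos: int) -> bool:
    """Return True if the base at 0-based index `pos` is the central A
    of a DRACH 5-mer. Hypothetical helper for illustration only."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    # The window needs two nucleotides on each side of the candidate site.
    if pos < 2 or pos + 2 >= len(seq):
        return False
    return DRACH.fullmatch(seq[pos - 2:pos + 3]) is not None
```

For instance, `is_drach_site("GGACU", 2)` evaluates to `True`, whereas `is_drach_site("CCACU", 2)` evaluates to `False` because C is not permitted at the D position. Sites that fail this check are exactly the non-DRACH candidates that a motif-only filter would discard.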
Project description:Gene expression profiles were generated from 199 primary breast cancer patients. Samples 1-176 were used in another study, GEO Series GSE22820, and form the training data set in this study. Samples 200-222 form a validation set. These data are used to build a machine learning classifier for estrogen receptor (ER) status. RNA was isolated from 199 primary breast cancer patients. A machine learning classifier was built to predict ER status using only three gene features.
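The sample numbering above implies a fixed training/validation split. A minimal sketch of that split follows, assuming that sample numbers are contiguous integers within each stated range; the function name `split_samples` is hypothetical and the classifier itself is not reproduced here.

```python
def split_samples(sample_numbers):
    """Split sample numbers into training (1-176) and validation (200-222)
    sets, following the ranges given in the description."""
    train = [n for n in sample_numbers if 1 <= n <= 176]
    validation = [n for n in sample_numbers if 200 <= n <= 222]
    return train, validation


# Assumption: every number in both ranges corresponds to one sample.
all_samples = list(range(1, 177)) + list(range(200, 223))
train, validation = split_samples(all_samples)

# 176 training + 23 validation samples = 199 patients in total
print(len(train), len(validation), len(train) + len(validation))
```

Under this assumption the two ranges account for all 199 patients (176 + 23), which is consistent with the stated cohort size.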
Project description:We aimed to extract spectral features of the urinary proteome using advanced mass spectrometry and machine learning algorithms in order to obtain more accurate results for disease classification. To this end, we sought to establish a novel diagnostic model for kidney diseases by combining the XGBoost machine learning algorithm with complete urinary proteomic information.