Dataset Information

Detecting m6A RNA modification from nanopore sequencing using a semi-supervised learning framework.


ABSTRACT: Direct nanopore-based RNA sequencing can be used to detect post-transcriptional base modifications, such as m6A methylation, based on the electric current signals produced by the distinct chemical structures of modified bases. A key challenge is the scarcity of adequate training data with known methylation modifications. We present Xron, a hybrid encoder-decoder framework that delivers a direct methylation-distinguishing basecaller by training on synthetic RNA data and immunoprecipitation-based experimental data in two steps. First, we generate data with more diverse modification combinations through in silico cross-linking. Second, we use this dataset to train an end-to-end neural network basecaller followed by fine-tuning on immunoprecipitation-based experimental data with label-smoothing. The trained neural network basecaller outperforms existing methylation detection methods on both read-level and site-level prediction scores. Xron is a standalone, end-to-end m6A-distinguishing basecaller capable of detecting methylated bases directly from raw sequencing signals, enabling de novo methylome assembly.

SUBMITTER: Teng H 

PROVIDER: S-EPMC10802372 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature

Publications

Detecting m6A RNA modification from nanopore sequencing using a semi-supervised learning framework.

Haotian Teng, Marcus Stoiber, Ziv Bar-Joseph, Carl Kingsford

bioRxiv: the preprint server for biology, 2024-01-07


Similar Datasets

| S-EPMC11610579 | biostudies-literature
| S-EPMC6959997 | biostudies-literature
2019-11-13 | GSE140262 | GEO
| S-EPMC9718678 | biostudies-literature
| S-EPMC11495873 | biostudies-literature
| S-EPMC9991887 | biostudies-literature
| S-EPMC9374783 | biostudies-literature
| S-EPMC8058768 | biostudies-literature
| S-EPMC10818168 | biostudies-literature
2024-10-28 | GSE265754 | GEO