Dataset Information

PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data.


ABSTRACT: More than half of human genes undergo alternative polyadenylation (APA) and generate mRNA transcripts with varying 3' untranslated regions (UTRs). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3'UTR length changes from bulk RNA-seq data fail to unravel tissue- and disease-specific APA dynamics. Here, we developed a next-generation bioinformatics algorithm and application, PolyAMiner-Bulk, that utilizes an attention-based machine learning architecture and an improved vector projection-based engine to infer differential APA dynamics accurately. When applied to earlier studies, PolyAMiner-Bulk accurately identified more than twice the number of APA changes in an RBM17 knockdown bulk RNA-seq dataset compared to current-generation tools. Moreover, on a separate dataset, PolyAMiner-Bulk revealed novel APA dynamics and pathways in scleroderma pathology and identified differential APA in a gene that an independent study had implicated in scleroderma pathogenesis. Lastly, we used PolyAMiner-Bulk to analyze RNA-seq data of post-mortem prefrontal cortexes from the ROSMAP data consortium and unraveled novel APA dynamics in Alzheimer's disease. Our method, PolyAMiner-Bulk, creates a paradigm for future alternative polyadenylation analysis from bulk RNA-seq data.

SUBMITTER: Jonnakuti VS 

PROVIDER: S-EPMC9900750 | biostudies-literature | 2023 Jan

REPOSITORIES: biostudies-literature

Publications

PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data.

Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK

bioRxiv: the preprint server for biology, 2023-01-24


Similar Datasets

| S-EPMC7320624 | biostudies-literature
| S-EPMC10950478 | biostudies-literature
| S-EPMC9162953 | biostudies-literature
2024-12-27 | GSE246294 | GEO
| S-EPMC6813887 | biostudies-literature
| S-EPMC9801043 | biostudies-literature
| S-EPMC9923804 | biostudies-literature
| S-EPMC9329739 | biostudies-literature
| S-EPMC8215916 | biostudies-literature
2019-08-14 | PXD014842 | Pride