Models

Dataset Information


Deshpande2019 - Random Forest model to predict long non-coding RNAs from coding RNAs in Zea Mays plant transcriptomic data


ABSTRACT: This is a Random Forest machine learning model that distinguishes long non-coding RNAs (lncRNAs) from coding mRNAs in plant transcriptomic data. The model assigns label 1 to coding sequences and label 2 to long non-coding sequences. Predictions use a combination of Open Reading Frame (ORF) based, sequence-based, and codon-bias features. To make predictions on Zea mays sequence data, users need to download the curated ONNX model and convert their sequences into a feature matrix as described in the PLIT paper (Deshpande et al. 2019).
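The workflow in the abstract (build a feature matrix, run the ONNX model, read labels 1 and 2) might be sketched as follows. This is a minimal illustration under stated assumptions, not the PLIT pipeline: the three features computed here (longest forward-strand ORF length, ORF coverage, GC content) are hypothetical stand-ins for the feature set defined in the PLIT paper, and the model path, float32 input type, and "first output holds the labels" convention are assumptions that should be checked against the downloaded model. Only the label meaning (1 = coding, 2 = long non-coding) comes from the entry itself.

```python
# Illustrative sketch only: the features below are NOT the PLIT feature set.
STOP_CODONS = {"TAA", "TAG", "TGA"}
LABELS = {1: "coding", 2: "lncRNA"}  # label meaning as stated in the abstract


def longest_orf_length(seq: str) -> int:
    """Length in nt of the longest ATG..stop ORF on the forward strand
    (stop codon included), scanning all three reading frames."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def feature_row(seq: str) -> list:
    """One row of a toy feature matrix: [ORF length, ORF coverage, GC content]."""
    orf = longest_orf_length(seq)
    coverage = orf / len(seq) if seq else 0.0
    return [float(orf), coverage, gc_content(seq)]


def decode(predictions) -> list:
    """Map the model's numeric labels to readable class names."""
    return [LABELS[int(p)] for p in predictions]


def predict_with_onnx(model_path: str, rows: list) -> list:
    """Run a downloaded ONNX model on a feature matrix.

    Requires numpy and onnxruntime. Assumes the model takes a single
    float32 matrix and returns class labels as its first output.
    """
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: np.asarray(rows, dtype=np.float32)})
    return decode(outputs[0])
```

With the real model file, a call would look like `predict_with_onnx("model.onnx", [feature_row(s) for s in sequences])`, where `model.onnx` is a placeholder filename and the rows must first be replaced by the actual PLIT features in the order the model expects.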

SUBMITTER: Sumukh Deshpande  

PROVIDER: BIOMD0000001067 | BioModels | 2023-05-22

REPOSITORIES: BioModels


Publications

PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets.

Sumukh Deshpande, James Shuttleworth, Jianhua Yang, Sandy Taramonli, Matthew England

Computers in Biology and Medicine, 2019-01-04


Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs which play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been extensively used for identification of lncRNAs. However, accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome as most coding potential computation (CPC) tools fail to accurately identify them in transcriptomic data. Well-known CPC tools such as CPC2, lncS ...

Similar Datasets

2017-11-08 | GSE98428 | GEO
2017-09-27 | GSE104252 | GEO
2012-01-07 | GSE34449 | GEO
2022-12-15 | GSE211655 | GEO
2014-04-15 | E-GEOD-50747 | biostudies-arrayexpress
2012-01-07 | E-GEOD-34449 | biostudies-arrayexpress
2022-10-12 | MSV000090519 | MassIVE
2024-07-17 | GSE268261 | GEO
2010-04-12 | E-GEOD-19937 | biostudies-arrayexpress
2015-12-25 | GSE75290 | GEO