Dataset Information

miSTAR: miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure.


ABSTRACT: In microRNA (miRNA) target prediction, typically two levels of information need to be modeled: the number of potential miRNA binding sites present in a target mRNA and the genomic context of each individual site. Single model structures insufficiently cope with this complex training data structure, consisting of feature vectors of unequal length as a consequence of the varying number of miRNA binding sites in different mRNAs. To circumvent this problem, we developed a two-layered, stacked model, in which the influence of binding site context is separately modeled. Using logistic regression and random forests, we applied the stacked model approach to a unique data set of 7990 probed miRNA-mRNA interactions, hereby including the largest number of miRNAs in model training to date. Compared to lower-complexity models, a particular stacked model, named miSTAR (miRNA stacked model target prediction; www.mi-star.org), displays a higher general performance and precision on top scoring predictions. More importantly, our model outperforms published and widely used miRNA target prediction algorithms. Finally, we highlight flaws in cross-validation schemes for evaluation of miRNA target prediction models and adopt a more fair and stringent approach.
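
The unequal-length problem described in the abstract can be illustrated with a minimal sketch. This is not the miSTAR implementation: the first-layer weights are hard-coded rather than learned, the two site features are hypothetical, and the choice of summary statistics (site count, maximum, sum, mean) is an assumption made only for illustration. It shows how per-site scores from a first layer could be collapsed into a fixed-length vector that a second-layer model can consume, whatever the number of binding sites in an mRNA.

```python
import math
from typing import List, Sequence


def site_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """First layer (illustrative): logistic score for one binding site's context."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def aggregate_sites(scores: List[float]) -> List[float]:
    """Collapse a variable number of site scores into a fixed-length vector.

    Returns [number of sites, max score, sum of scores, mean score].
    These statistics are an assumption, not taken from the paper.
    """
    if not scores:
        return [0.0, 0.0, 0.0, 0.0]
    total = sum(scores)
    return [float(len(scores)), max(scores), total, total / len(scores)]


def second_layer_input(
    sites: List[Sequence[float]], weights: Sequence[float], bias: float
) -> List[float]:
    """Score every site, then summarize: the input a second-layer model would see."""
    return aggregate_sites([site_score(s, weights, bias) for s in sites])


# Hypothetical first-layer parameters; in a real stacked model these are learned.
WEIGHTS = [1.0, -0.5]
BIAS = 0.0

# Toy mRNAs with different numbers of binding sites (two made-up features per site).
mrna_a = [[0.0, 0.0]]
mrna_b = [[1.0, 0.0], [0.0, 2.0], [0.5, 1.0]]

vec_a = second_layer_input(mrna_a, WEIGHTS, BIAS)
vec_b = second_layer_input(mrna_b, WEIGHTS, BIAS)

# Both vectors have the same length even though the site counts differ.
print(len(vec_a), len(vec_b))
```

In a trained stacked model, both layers would be fitted to data (the abstract names logistic regression and random forests as the learners), and the second layer would be trained on vectors such as `vec_a` and `vec_b`.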

SUBMITTER: Van Peer G 

PROVIDER: S-EPMC5397177 | biostudies-literature | 2017 Apr

REPOSITORIES: biostudies-literature

Publications

miSTAR: miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure.

Van Peer Gert, De Paepe Ayla, Stock Michiel, Anckaert Jasper, Volders Pieter-Jan, Vandesompele Jo, De Baets Bernard, Waegeman Willem

Nucleic Acids Research, 2017-04-01, issue 7


Similar Datasets

| S-EPMC8453239 | biostudies-literature
2013-05-25 | E-GEOD-46611 | biostudies-arrayexpress
2013-05-25 | GSE46611 | GEO
| S-EPMC4243195 | biostudies-literature
| S-EPMC3818907 | biostudies-literature
| S-EPMC4605288 | biostudies-literature
| S-EPMC10343850 | biostudies-literature
| S-EPMC3583062 | biostudies-literature
| S-EPMC3790831 | biostudies-literature
| S-EPMC5587804 | biostudies-other