Dataset Information

Machine learning for biomedical literature triage.


ABSTRACT: This paper presents a machine learning system for supporting the first task of the biological literature manual curation process, called triage. We compare the performance of various classification models by experimenting with dataset sampling factors and a set of features, as well as three different machine learning algorithms (Naive Bayes, Support Vector Machine and Logistic Model Trees). The results show that the model best suited to the imbalanced datasets of the triage classification task is obtained by using domain-relevant features, an under-sampling technique, and the Logistic Model Trees algorithm.
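
The abstract does not say how the under-sampling was carried out, so the following is only a minimal sketch of one common approach: random under-sampling of the majority class. The function name `undersample`, the `factor` parameter, and the toy document labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def undersample(records, labels, majority_label, factor, seed=0):
    """Keep all minority records and a random `factor` share of majority records.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    # Number of majority records to keep (at least one if any exist).
    keep_n = max(1, round(len(majority) * factor)) if majority else 0
    # Sort the kept indices so the original document order is preserved.
    kept = sorted(minority + rng.sample(majority, keep_n))
    return [records[i] for i in kept], [labels[i] for i in kept]


# Toy imbalanced corpus: 10 relevant documents, 90 irrelevant ones.
docs = [f"doc{i}" for i in range(100)]
labels = ["relevant"] * 10 + ["irrelevant"] * 90

sampled_docs, sampled_labels = undersample(docs, labels, "irrelevant", factor=0.2)
print(Counter(sampled_labels))
```

Lowering `factor` moves the class ratio closer to balance but discards more majority examples, which is presumably the kind of trade-off that experimenting with several sampling factors is meant to explore.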

SUBMITTER: Almeida H 

PROVIDER: S-EPMC4281078 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Machine learning for biomedical literature triage.

Almeida, Hayda; Meurs, Marie-Jean; Kosseim, Leila; Butler, Greg; Tsang, Adrian

PLoS ONE, 2014-12-31, issue 12



Similar Datasets

| S-EPMC9199879 | biostudies-literature
| S-EPMC8461527 | biostudies-literature
| S-EPMC9621252 | biostudies-literature
| S-EPMC9882585 | biostudies-literature
2013-12-23 | E-GEOD-53091 | biostudies-arrayexpress
| S-EPMC11854074 | biostudies-literature
| S-EPMC6054406 | biostudies-literature
| S-EPMC8716730 | biostudies-literature
| S-EPMC8212142 | biostudies-literature
| S-EPMC9848046 | biostudies-literature