
Dataset Information


Examining influential factors for acknowledgements classification using supervised learning.


ABSTRACT: Acknowledgements have been examined as important elements in measuring the contributions to and intellectual debts of a scientific publication. Unlike previous studies, which were limited in scope and relied on manual examination, the present study aimed to classify acknowledgements automatically on a large scale. To this end, we first created a training dataset for acknowledgements classification by sampling the acknowledgements sections from the entire PubMed Central database. Second, we adopted various supervised learning algorithms to examine which algorithm performed best under which conditions. In addition, we observed the factors affecting classification performance. We investigated the effects of three main aspects: classification algorithms, categories, and text representations. The CNN+Doc2Vec algorithm achieved the highest performance, with 93.58% accuracy on the original dataset and 87.93% on the converted dataset. The experimental results indicated that the characteristics of categories and sentence patterns influenced classification performance. Most of the classifiers performed better on the financial, peer interactive communication, and technical support categories than on the other classes.
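
As a rough illustration of the supervised setup the abstract describes, the sketch below trains a small classifier on labelled acknowledgement sentences and scores it by accuracy. It is not the study's method: it swaps in a bag-of-words multinomial Naive Bayes model in place of CNN+Doc2Vec, the training sentences are invented for illustration and do not come from the PubMed Central sample, and the names `TRAIN`, `train_nb`, `predict`, and `accuracy` are hypothetical. Only the three category labels are taken from the abstract.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy sentences, invented for illustration only.
# They are NOT drawn from the study's dataset.
TRAIN = [
    ("this work was supported by a grant from the national research foundation",
     "financial"),
    ("funding was provided by the ministry of education",
     "financial"),
    ("we thank our colleagues for helpful comments and discussion on the manuscript",
     "peer interactive communication"),
    ("the authors thank the reviewers for valuable feedback and discussion",
     "peer interactive communication"),
    ("we thank the laboratory staff for technical assistance with sequencing",
     "technical support"),
    ("technical assistance with microscopy was provided by the imaging facility",
     "technical support"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple representation)."""
    return text.lower().split()


def train_nb(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the given label."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


model = train_nb(TRAIN)
```

Calling `predict(model, "supported by a grant")` returns one of the three category labels, and `accuracy(model, held_out)` gives the fraction of correct predictions on any list of `(text, label)` pairs. Comparing algorithms or text representations, as the study does, would amount to repeating this scoring step for each variant on the same held-out data.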

SUBMITTER: Song M 

PROVIDER: S-EPMC7021295 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

Examining influential factors for acknowledgements classification using supervised learning.

Min Song, Keun Young Kang, Tatsawan Timakum, Xinyuan Zhang

PLoS ONE, 2020-02-14, issue 2



Similar Datasets

| S-EPMC3093382 | biostudies-literature
2022-02-17 | PXD025594 | Pride
| S-EPMC7856146 | biostudies-literature
| S-EPMC6264004 | biostudies-literature
| S-EPMC8582378 | biostudies-literature
| S-EPMC4289190 | biostudies-literature
| S-EPMC8722080 | biostudies-literature
| S-EPMC9753267 | biostudies-literature
| S-EPMC4147614 | biostudies-literature
| S-EPMC7551840 | biostudies-literature