Dataset Information

Automatic recognition of self-acknowledged limitations in clinical research literature.


ABSTRACT: Objective: To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency. Methods: To develop our recognition methods, we used a set of 8431 sentences from 1197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing, and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM). Results: Annotators had good agreement in labeling limitation sentences (Krippendorff's α = 0.781). Of the three methods used, the rule-based method yielded the best performance with 91.5% accuracy (95% CI [90.1-92.9]), while self-training with SVM led to a small improvement over fully supervised learning (89.9%, 95% CI [88.4-91.4] vs 89.6%, 95% CI [88.1-91.1]). Conclusions: The approach presented can be incorporated into the workflows of stakeholders focusing on research transparency to improve reporting of limitations in clinical studies.
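
The self-training idea mentioned in the abstract (expanding the training set with confidently classified unlabeled sentences) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the tiny naive Bayes classifier is a dependency-free stand-in for the LR and SVM models named above, and the names `NaiveBayes`, `self_train`, `threshold`, and `max_rounds`, as well as the example sentences, are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing.

    Stand-in classifier only; the abstract names LR and SVM instead.
    """

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        """Return {label: probability} for one sentence."""
        total = sum(self.prior.values())
        log_scores = {}
        for c in self.classes:
            denom = sum(self.counts[c].values()) + len(self.vocab)
            score = math.log(self.prior[c] / total)
            for w in tokenize(text):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.counts[c][w] + 1) / denom)
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Grow the training set with confidently pseudo-labeled sentences.

    labeled: list of (sentence, label); unlabeled: list of sentences.
    Returns the final model and the final training-set size.
    """
    texts = [t for t, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    model = NaiveBayes().fit(texts, labels)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for sentence in pool:
            probs = model.predict_proba(sentence)
            label, p = max(probs.items(), key=lambda kv: kv[1])
            if p >= threshold:  # confident: adopt the pseudo-label
                texts.append(sentence)
                labels.append(label)
                added += 1
            else:
                remaining.append(sentence)
        pool = remaining
        if added == 0:  # nothing new to learn from; stop early
            break
        model = NaiveBayes().fit(texts, labels)
    return model, len(texts)


# Invented example sentences, for illustration only.
labeled = [
    ("a limitation of this study is the small sample", "limitation"),
    ("our study has several limitations", "limitation"),
    ("patients were randomized to two groups", "other"),
    ("the primary outcome was mortality", "other"),
]
unlabeled = [
    "another limitation is the short follow-up",
    "patients were followed for mortality",
]
model, n_train = self_train(labeled, unlabeled)
probs = model.predict_proba("this study has a limitation")
```

The confidence threshold controls the trade-off in this kind of loop: a higher value adds fewer pseudo-labeled sentences but reduces the risk of reinforcing the classifier's own mistakes.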

SUBMITTER: Kilicoglu H 

PROVIDER: S-EPMC6016608 | biostudies-literature | 2018 Jul

REPOSITORIES: biostudies-literature

Publications

Automatic recognition of self-acknowledged limitations in clinical research literature.

Halil Kilicoglu, Graciela Rosemblat, Mario Malicki, Gerben Ter Riet

Journal of the American Medical Informatics Association : JAMIA, 2018 Jul 1, issue 7


Similar Datasets

| S-EPMC9472736 | biostudies-literature
| S-EPMC1559726 | biostudies-literature
| S-EPMC8621146 | biostudies-literature
| S-EPMC6928703 | biostudies-literature
| S-EPMC3183066 | biostudies-literature
| S-EPMC8779018 | biostudies-literature
| S-EPMC4947568 | biostudies-literature
| S-EPMC6609919 | biostudies-other
| S-EPMC5920444 | biostudies-literature