Dataset Information

Accelerating annotation of articles via automated approaches: evaluation of the neXtA5 curation-support tool by neXtProt.


ABSTRACT: The development of efficient text-mining tools promises to boost the curation workflow by significantly reducing the time needed to process the literature into biological databases. We have developed a curation support tool, neXtA5, that provides a search engine coupled with an annotation system directly integrated into a biocuration workflow. neXtA5 assists curation with modules optimized for the various curation tasks: document triage, entity recognition and information extraction. Here, we describe the evaluation of neXtA5 by expert curators. We first assessed the annotations of two independent curators to provide a baseline for comparison. To evaluate the performance of neXtA5, we submitted requests and compared the neXtA5 results with the manual curation. The analysis focuses on the usability of neXtA5 to support the curation of two types of data: biological processes (BPs) and diseases (Ds). We evaluated the relevance of the papers proposed as well as the recall and precision of the suggested annotations. The evaluation of the precision of neXtA5 document triage showed that both curators agree with neXtA5 for 67% (BP) and 63% (D) of abstracts, while the curators agree with each other on accepting or rejecting an abstract ~80% of the time. Hence, the precision of the triage system is satisfactory. For concept extraction, curators approved 35% (BP) and 25% (D) of the neXtA5 annotations. Conversely, neXtA5 successfully annotated up to 36% (BP) and 68% (D) of the terms identified by curators. The user feedback obtained in these tests highlighted the need for improvement in the ranking function of neXtA5 annotations. Therefore, we transformed the information extraction component into an annotation ranking system. This improvement results in a top precision (precision at first rank) of 59% (D) and 63% (BP). These results suggest that when considering only the first extracted entity, the current system achieves a precision comparable with expert biocurators.
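
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how precision, recall and precision at first rank are conventionally computed against a curator's annotations. It is not the authors' evaluation code: the function names, the data layout and the term identifiers are all hypothetical, and the toy values are invented for illustration only.

```python
from typing import Dict, List, Set, Tuple


def precision_recall(suggested: Set[str], curated: Set[str]) -> Tuple[float, float]:
    """Precision: share of suggested terms the curator also chose.
    Recall: share of the curator's terms that were suggested."""
    hits = len(suggested & curated)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(curated) if curated else 0.0
    return precision, recall


def precision_at_first_rank(
    ranked: Dict[str, List[str]], curated: Dict[str, Set[str]]
) -> float:
    """Fraction of abstracts whose top-ranked suggestion is in the curator's set.
    Abstracts with no suggestion at all are left out of the denominator."""
    scored = [doc for doc, terms in ranked.items() if terms]
    if not scored:
        return 0.0
    correct = sum(1 for doc in scored if ranked[doc][0] in curated.get(doc, set()))
    return correct / len(scored)


# Invented toy data: identifiers are placeholders, not neXtA5 output.
ranked_suggestions = {
    "abstract_1": ["TERM:A", "TERM:B", "TERM:C"],
    "abstract_2": ["TERM:D", "TERM:E"],
}
curator_terms = {
    "abstract_1": {"TERM:A", "TERM:F"},
    "abstract_2": {"TERM:E"},
}

p, r = precision_recall(
    set(ranked_suggestions["abstract_1"]), curator_terms["abstract_1"]
)
p_at_1 = precision_at_first_rank(ranked_suggestions, curator_terms)
print(f"precision={p:.2f} recall={r:.2f} P@1={p_at_1:.2f}")
```

Under these definitions, precision at first rank looks only at the top suggestion per abstract, which is why a ranking change can raise it without changing which terms are extracted.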

SUBMITTER: Britan A 

PROVIDER: S-EPMC6301339 | biostudies-literature | 2018 Jan

REPOSITORIES: biostudies-literature

Publications

Accelerating annotation of articles via automated approaches: evaluation of the neXtA5 curation-support tool by neXtProt.

Aurore Britan, Isabelle Cusin, Valérie Hinard, Luc Mottin, Emilie Pasche, Julien Gobeill, Valentine Rech de Laval, Anne Gleizes, Daniel Teixeira, Pierre-André Michel, Patrick Ruch, Pascale Gaudet

Database: the journal of biological databases and curation, 2018-01-01


Similar Datasets

| S-EPMC4930835 | biostudies-literature
| S-EPMC6602571 | biostudies-literature
| S-EPMC7288430 | biostudies-literature
| S-EPMC7723322 | biostudies-literature
| S-EPMC5467557 | biostudies-literature
| S-EPMC4236444 | biostudies-literature
2020-06-03 | MSV000085540 | MassIVE
| S-EPMC1933441 | biostudies-literature
| S-EPMC5753331 | biostudies-literature
| S-EPMC1539029 | biostudies-literature