
Dataset Information


An extensive review of tools for manual annotation of documents.


ABSTRACT:

Motivation

Annotation tools are used to build training and test corpora, which are essential for developing and evaluating new natural language processing algorithms. They are also used to extract new information for a particular use case. However, because so many annotation tools exist, finding the one that best fits particular needs is a demanding task: it requires searching the scientific literature and then installing and trying various tools.

Methods

We searched for annotation tools and selected a subset of them according to five requirements with which they had to comply, such as being Web-based or supporting the definition of a schema. We installed the selected tools (when necessary), carried out hands-on experiments and evaluated them using 26 criteria that covered functional and technical aspects. For each criterion we defined three levels of match, and we combined the results into a single score for the final evaluation of the tools.
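The scoring idea described above can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the three match levels are taken to be full, partial and none, they are mapped to 1.0, 0.5 and 0.0, and the score is their unweighted mean over all criteria. The names `Match` and `tool_score` are hypothetical and do not come from the paper.

```python
from enum import Enum


class Match(Enum):
    """Three levels of match for one criterion (numeric values are assumed)."""
    FULL = 1.0
    PARTIAL = 0.5
    NONE = 0.0


def tool_score(matches):
    """Return the unweighted mean of the match values, a number in [0, 1].

    This is an assumed aggregation, not the formula used in the review.
    """
    if not matches:
        raise ValueError("at least one criterion is required")
    return sum(m.value for m in matches) / len(matches)


# Hypothetical tool rated on 26 criteria: 13 full, 6 partial, 7 none.
example = [Match.FULL] * 13 + [Match.PARTIAL] * 6 + [Match.NONE] * 7
score = tool_score(example)
print(f"{score:.2f}")
```

Under these assumptions a tool that fully complies with every criterion scores 1.0, which is consistent with the maximum value reported in the abstract; a weighted scheme would change the other values.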

Results

We evaluated 78 tools and selected the following 15 for a detailed evaluation: BioQRator, brat, Catma, Djangology, ezTag, FLAT, LightTag, MAT, MyMiner, PDFAnno, prodigy, tagtog, TextAE, WAT-SL and WebAnno. The number of criteria with which a tool fully complied ranged from 9 to 20 of the 26, which demonstrated that some tools are comprehensive and mature enough to be used on most annotation projects. WebAnno obtained the highest score, 0.81 out of a maximum of 1.0.

SUBMITTER: Neves M 

PROVIDER: S-EPMC7820865 | biostudies-literature | 2021 Jan

REPOSITORIES: biostudies-literature


Publications

An extensive review of tools for manual annotation of documents.

Mariana Neves, Jurica Ševa

Briefings in Bioinformatics, 2021-01-01, issue 1



Similar Datasets

| S-EPMC8501643 | biostudies-literature
| S-EPMC6299973 | biostudies-literature
| S-EPMC5668921 | biostudies-other
| S-EPMC5111626 | biostudies-literature
| S-EPMC10502000 | biostudies-literature
| S-EPMC8342444 | biostudies-literature
| S-EPMC2516305 | biostudies-literature
| S-EPMC3706743 | biostudies-literature
| S-EPMC8671438 | biostudies-literature
| S-EPMC10579860 | biostudies-literature