
Dataset Information


GOAnnotator: linking protein GO annotations to evidence text.


ABSTRACT: BACKGROUND: Annotation of proteins with gene ontology (GO) terms is ongoing work and a complex task. Manual GO annotation is precise and precious, but it is time-consuming. Therefore, instead of curated annotations, most of the proteins come with uncurated annotations, which have been generated automatically. Text-mining systems that use literature for automatic annotation have been proposed, but they do not satisfy the high quality expectations of curators. RESULTS: In this paper we describe an approach that links uncurated annotations to text extracted from literature. The selection of the text is based on the similarity of the text to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins. CONCLUSION: The GO curators assessed GOAnnotator with a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to only select GO terms similar to GO terms from uncurated annotations in GOA. Our approach is the first one to achieve high precision, which is crucial for the efficient support of GO curators. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.
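The two ideas in the abstract, selecting text by its similarity to a GO term and using the GO hierarchy to restrict which terms are accepted, can be illustrated with a minimal sketch. This is not GOAnnotator's implementation: the token-overlap score, the 0.5 threshold, the function names, and the toy parent map are all assumptions made for illustration, and the record does not describe the actual similarity measure.

```python
import re

def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(term_name, sentence):
    """Fraction of the term's tokens that also occur in the sentence.
    (An assumed stand-in for whatever measure the tool really uses.)"""
    term_tokens = tokens(term_name)
    if not term_tokens:
        return 0.0
    return len(term_tokens & tokens(sentence)) / len(term_tokens)

def select_evidence(term_name, sentences, threshold=0.5):
    """Return sentences scoring at least `threshold`, best first."""
    scored = [(similarity(term_name, s), s) for s in sentences]
    scored.sort(key=lambda pair: -pair[0])
    return [s for score, s in scored if score >= threshold]

def ancestors(term, parents):
    """All terms reachable by following parent links from `term`."""
    seen, stack = set(), list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def related(candidate, uncurated, parents):
    """True if the two terms are equal or one is an ancestor of the other."""
    return (candidate == uncurated
            or candidate in ancestors(uncurated, parents)
            or uncurated in ancestors(candidate, parents))

# Toy hierarchy keyed by term name (hypothetical and heavily simplified).
toy_parents = {
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "transport": ["biological process"],
}

sentences = [
    "The protein shows kinase activity in vitro.",
    "Samples were stored overnight.",
]
evidence = select_evidence("kinase activity", sentences)
```

In this sketch a sentence is kept only when enough of the term's words appear in it, and a term found in text would be accepted only if `related` places it on the same branch as the uncurated annotation.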

SUBMITTER: Couto FM 

PROVIDER: S-EPMC1769513 | biostudies-literature | 2006 Dec

REPOSITORIES: biostudies-literature


Publications

GOAnnotator: linking protein GO annotations to evidence text.

Couto FM, Silva MJ, Lee V, Dimmer E, Camon E, Apweiler R, Kirsch H, Rebholz-Schuhmann D

Journal of Biomedical Discovery and Collaboration, 2006-12-20



Similar Datasets

| S-EPMC3405096 | biostudies-literature
| S-EPMC2335285 | biostudies-literature
| S-EPMC1538768 | biostudies-literature
| S-EPMC7806674 | biostudies-literature
| S-EPMC5521088 | biostudies-literature
| S-EPMC5625555 | biostudies-literature
| S-EPMC6307753 | biostudies-literature
| S-EPMC6030983 | biostudies-literature
| S-EPMC9300714 | biostudies-literature
| S-EPMC4112614 | biostudies-literature