
Dataset Information


Terminologies for text-mining; an experiment in the lipoprotein metabolism domain.


ABSTRACT:

BACKGROUND: The engineering of ontologies, especially with a view to text-mining use, is still a new research field. There is not yet a well-defined theory or technology for ontology construction. Many of the ontology design steps remain manual and are based on personal experience and intuition. However, there have been a few efforts at automatic construction of ontologies in the form of extracted lists of terms and the relations between them.

RESULTS: We share experience acquired during the manual development of a lipoprotein metabolism ontology (LMO) to be used for text-mining. We compare the manually created ontology terms with the terminology derived automatically by four different automatic term recognition (ATR) methods. Among the top 50 predicted terms, up to 89% are relevant. Among the top 1000 terms, the best method still yields 51% relevant terms. A corpus of 3066 documents contains 53% of the LMO terms, and 38% can be generated with one of the methods.

CONCLUSIONS: Given high precision, automatic methods can help decrease development time and provide significant support for the identification of domain-specific vocabulary. The coverage of the domain vocabulary depends strongly on the underlying documents. Ontology development for text mining should be performed in a semi-automatic way, taking ATR results as input and following the guidelines we described.

AVAILABILITY: The TFIDF term recognition is available as a Web Service, described at http://gopubmed4.biotec.tu-dresden.de/IdavollWebService/services/CandidateTermGeneratorService?wsdl.
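The kind of evaluation reported in the abstract (ranking candidate terms, then measuring the share of relevant terms among the top k) can be illustrated with a minimal sketch. This is not the authors' implementation or their Web Service: the function names `tfidf_rank` and `precision_at_k`, the whitespace tokenization, the single-word candidates, and the particular TF-IDF weighting are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tfidf_rank(documents):
    """Rank single-word candidate terms by TF-IDF summed over all documents.

    Hypothetical helper: tokenization is plain lowercase whitespace splitting,
    and the weighting (relative term frequency times log inverse document
    frequency) is one common TF-IDF variant, assumed here for illustration.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    scores = Counter()
    for doc in tokenized:
        for tok, count in Counter(doc).items():
            scores[tok] += (count / len(doc)) * math.log(n_docs / df[tok])
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked]


def precision_at_k(ranked_terms, relevant_terms, k):
    """Fraction of the top-k ranked terms found in the reference term set."""
    top = ranked_terms[:k]
    if not top:
        return 0.0
    return sum(1 for term in top if term in relevant_terms) / len(top)
```

With a manually curated term list standing in for `relevant_terms`, calling `precision_at_k(tfidf_rank(corpus), relevant_terms, 50)` would give a figure comparable in kind to the top-50 precision quoted above, though real ATR methods typically also handle multi-word terms and linguistic filtering.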

SUBMITTER: Alexopoulou D 

PROVIDER: S-EPMC2367629 | biostudies-literature | 2008

REPOSITORIES: biostudies-literature


Publications

Terminologies for text-mining; an experiment in the lipoprotein metabolism domain.

Alexopoulou Dimitra, Wächter Thomas, Pickersgill Laura, Eyre Cecilia, Schroeder Michael

BMC Bioinformatics, 2008-04-25



Similar Datasets

| S-EPMC4874549 | biostudies-literature
| S-EPMC4674139 | biostudies-literature
| S-EPMC5975701 | biostudies-literature
| S-EPMC2217579 | biostudies-literature
| S-EPMC2374703 | biostudies-literature
| S-EPMC3939821 | biostudies-literature
| S-EPMC7148005 | biostudies-literature
| S-EPMC6550425 | biostudies-literature
| S-EPMC3475109 | biostudies-literature
| S-EPMC2735884 | biostudies-literature