Dataset Information

Analysing Syntactic Regularities and Irregularities in SNOMED-CT.

ABSTRACT:

Motivation

In this paper we demonstrate the usage of RIO, a framework for detecting syntactic regularities using cluster analysis of the entities in the signature of an ontology. Quality assurance in ontologies is vital for their use in real applications, but it is also a complex and difficult task. Methods and tools for it are especially important when the ontology lacks documentation and the user cannot consult the ontology developers to understand its construction. One aspect of quality assurance is checking how well an ontology complies with established 'coding standards': is the ontology regular in how descriptions of different types of entities are axiomatised? Is there a similar way to describe them, and are there any corner cases that are not covered by a pattern? Detection of regularities and irregularities in axiom patterns should provide ontology authors and quality inspectors with a level of abstraction such that checking compliance with coding standards can be automated. However, there is a lack of such reverse ontology engineering methods and tools.

Results

The RIO framework allows regularities, i.e. repetitive structures in the axioms, to be detected in an OWL ontology. We describe the use of standard machine learning approaches to cluster similar entities and generalise over their axioms to find regularities. This abstraction allows matches to, and deviations from, an ontology's patterns to be shown. We demonstrate its usage by inspecting three modules from SNOMED-CT, a large medical terminology, that cover "Present" and "Absent" findings, as well as "Chronic" and "Acute" findings. The module sizes are 5 065, 20 688 and 19 812 asserted axioms. The modules are analysed in terms of the types and number of regularities and irregularities in their asserted axioms. The analysis showed that some modules of the terminology, which were expected to instantiate a pattern described in the SNOMED-CT technical guide, had a high number of deviations from that regularity. A subset of these were categorised as "design defects" by verifying them against past work on the quality assurance of SNOMED-CT. These were mainly incomplete descriptions. In the worst case, the expected patterns described in the technical guide were followed by only 5% of the axioms in the module.
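
The general idea of clustering similar entities, generalising over their axioms and then listing deviations can be illustrated with a minimal sketch. It is not RIO's implementation: the axioms are invented toy triples rather than SNOMED-CT content, the "shape" of an entity is assumed to be simply the set of relations it uses, and the greedy Jaccard-threshold grouping stands in for whatever clustering method the framework actually applies. All function names and the threshold value are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy axioms as (entity, relation, filler) triples.
# These are illustrative only and are not taken from SNOMED-CT.
AXIOMS = [
    ("ChronicAsthma", "subClassOf", "Asthma"),
    ("ChronicAsthma", "hasCourse", "Chronic"),
    ("ChronicGastritis", "subClassOf", "Gastritis"),
    ("ChronicGastritis", "hasCourse", "Chronic"),
    ("ChronicSinusitis", "subClassOf", "Sinusitis"),
    ("ChronicSinusitis", "hasCourse", "Chronic"),
    ("ChronicBronchitis", "subClassOf", "Bronchitis"),  # no hasCourse axiom
]


def entity_shapes(axioms):
    """Map each entity to the set of relations used in its axioms."""
    shapes = defaultdict(set)
    for entity, relation, _filler in axioms:
        shapes[entity].add(relation)
    return {entity: frozenset(rels) for entity, rels in shapes.items()}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(shapes, threshold=0.5):
    """Greedy grouping: join the first cluster holding a similar member."""
    clusters = []
    for entity in sorted(shapes):
        for members in clusters:
            if any(jaccard(shapes[entity], shapes[m]) >= threshold for m in members):
                members.append(entity)
                break
        else:
            clusters.append([entity])
    return clusters


def find_regularities(axioms, threshold=0.5):
    """Per cluster, report the dominant shape and the entities deviating from it."""
    shapes = entity_shapes(axioms)
    report = []
    for members in cluster(shapes, threshold):
        pattern, _count = Counter(shapes[m] for m in members).most_common(1)[0]
        report.append({
            "pattern": tuple(sorted(pattern)),
            "matches": [m for m in members if shapes[m] == pattern],
            "deviations": [m for m in members if shapes[m] != pattern],
        })
    return report


results = find_regularities(AXIOMS)
for entry in results:
    print(entry)
```

In this toy input the entity lacking a `hasCourse` axiom is grouped with its siblings but reported as a deviation, which mirrors the "incomplete description" kind of irregularity mentioned above. A real analysis would work on OWL axioms and a richer similarity measure.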

Conclusion

It is possible to automatically detect regularities and then inspect irregularities in an ontology. We argue that RIO is a tool to find and report such matches and mismatches for evaluation by domain experts. We have demonstrated that standard clustering techniques from machine learning can offer a tool in the drive for quality assurance in ontologies.

Availability

http://riotool.sourceforge.net/

Contact

eleni.mikroyannidi@manchester.ac.uk, robert.stevens@manchester.ac.uk

SUBMITTER: Mikroyannidi E

PROVIDER: S-EPMC3637289 | biostudies-literature | 2012 Dec

REPOSITORIES: biostudies-literature

Publications

Analysing Syntactic Regularities and Irregularities in SNOMED-CT.

Eleni Mikroyannidi, Robert Stevens, Luigi Iannone, Alan Rector

Journal of Biomedical Semantics, 17 December 2012, 1

PMID: 23244503

Similar Datasets

Mapping Hungarian procedure codes to SNOMED CT.
| S-EPMC10585817 | biostudies-literature

Literature review of SNOMED CT use.
| S-EPMC3957381 | biostudies-other

Translating and evaluating historic phenotyping algorithms using SNOMED CT.
| S-EPMC9846670 | biostudies-literature

The use of SNOMED CT, 2013-2020: a literature review.
| S-EPMC8363812 | biostudies-literature

Qualitative analysis of manual annotations of clinical text with SNOMED CT.
| S-EPMC6307753 | biostudies-literature

A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships.
| S-EPMC6283061 | biostudies-literature

Scalable quality assurance for large SNOMED CT hierarchies using subject-based subtaxonomies.
| S-EPMC6283060 | biostudies-literature

Semantic validation of the use of SNOMED CT in HL7 clinical documents.
| S-EPMC3152505 | biostudies-literature

Implementing description-logic rules for SNOMED-CT attributes through a table-driven approach.
| S-EPMC3000783 | biostudies-literature

Enriching a primary health care version of ICD-10 using SNOMED CT mapping.
| S-EPMC2908062 | biostudies-literature