Dataset Information

Automating terminological networks to link heterogeneous biomedical databases.


ABSTRACT: As cross-disciplinary research escalates, researchers face the challenge of linking disparate biomedical databases that were developed without common indexes. Manually indexing these large-scale databases is laborious and often impractical. Solutions involving mediating terminologies have been proposed, but mapping terms from the databases of interest to these mediating terminologies is also laborious, and keeping the indexes synchronized is an additional problem. In this study we describe a novel method of linking heterogeneous databases using terminology networks constructed with automated mapping methods. Linkage was established between two disparate biomedical databases (SNOMED-CT and HDG) through two relevant intermediating databases (UMLS and OMIM). A gold standard of 514 distinct matches serves as proof of principle. As hypothesized, 1) manually curated pathways provide high precision but low recall, and 2) automated terminology pathways can significantly increase recall at acceptable precision. Taken together, these findings suggest that combining manual and automated terminology networks could improve recall and precision incrementally.
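The linking idea in the abstract, chaining term mappings through intermediating terminologies and scoring the result against a gold standard, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`compose`, `to_pairs`, `precision_recall`) and every identifier in the toy data are hypothetical, and the mappings are invented solely to show the arithmetic.

```python
def compose(*mappings):
    """Chain term mappings (A->B, B->C, ...) into a single A->last mapping.

    Each mapping is a dict from a source term to a set of target terms.
    Source terms that lose all targets along the chain are dropped.
    """
    result = None
    for mapping in mappings:
        if result is None:
            result = {src: set(targets) for src, targets in mapping.items()}
            continue
        chained = {}
        for src, mids in result.items():
            targets = set()
            for mid in mids:
                targets |= set(mapping.get(mid, ()))
            if targets:
                chained[src] = targets
        result = chained
    return result or {}


def to_pairs(mapping):
    """Flatten a term->set mapping into a set of (source, target) pairs."""
    return {(src, tgt) for src, targets in mapping.items() for tgt in targets}


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against gold pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy identifiers: source -> first mediator -> second mediator -> target.
source_to_med1 = {"s1": {"u1"}, "s2": {"u2", "u3"}, "s3": {"u9"}}
med1_to_med2 = {"u1": {"o1"}, "u2": {"o2"}, "u3": {"o3"}}
med2_to_target = {"o1": {"g1"}, "o2": {"g2"}, "o3": {"g4"}}

# Hypothetical gold-standard matches between source and target terms.
gold = {("s1", "g1"), ("s2", "g2"), ("s3", "g3")}

linked = compose(source_to_med1, med1_to_med2, med2_to_target)
predicted = to_pairs(linked)
precision, recall = precision_recall(predicted, gold)
```

In this toy data, `s3` has no path through the mediators, which lowers recall, while the spurious path from `s2` to `g4` lowers precision. This is the trade-off the abstract describes between curated and automated pathways.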

SUBMITTER: Wang X 

PROVIDER: S-EPMC2917348 | biostudies-literature | 2004

REPOSITORIES: biostudies-literature

Publications

Automating terminological networks to link heterogeneous biomedical databases.

Wang Xiaoyan, Quek Hui Nar, Cantor Michael, Kra Pauline, Schultz Aylit, Lussier Yves A

Studies in Health Technology and Informatics, 2004-01-01, Pt 1


Similar Datasets

| S-EPMC3228855 | biostudies-literature
| S-EPMC1764450 | biostudies-literature
| S-EPMC5860112 | biostudies-literature
| S-EPMC2335285 | biostudies-literature
| S-EPMC4448321 | biostudies-literature
| S-EPMC10517490 | biostudies-literature
| S-EPMC4926749 | biostudies-literature
| S-EPMC5963080 | biostudies-other
| S-EPMC5393592 | biostudies-literature
| S-EPMC3314880 | biostudies-other