Dataset Information

Exploring semantic deep learning for building reliable and reusable one health knowledge from PubMed systematic reviews and veterinary clinical notes.

ABSTRACT:

Background

Deep Learning opens up opportunities for routinely scanning large bodies of biomedical literature and clinical narratives to represent the meaning of biomedical and clinical terms. However, validating and integrating this knowledge at scale requires cross-checking against ground truths (i.e. evidence-based resources) that are unavailable in an actionable or computable form. In this paper we explore how to turn information about diagnoses, prognoses, therapies and other clinical concepts into computable knowledge using free-text data about human and animal health. We used a Semantic Deep Learning approach that combines Semantic Web technologies and Deep Learning to acquire and validate knowledge about 11 well-known medical conditions mined from two sets of unstructured free-text data: 300 K PubMed Systematic Review articles (the PMSB dataset) and 2.5 M veterinary clinical notes (the VetCN dataset). For each target condition we obtained 20 related clinical concepts using two deep learning methods applied separately to the two datasets, resulting in 880 term pairs (target term, candidate term). Each concept, represented by an n-gram, was mapped to UMLS using MetaMap; we also developed a bespoke method for mapping short forms (e.g. abbreviations and acronyms). Existing ontologies were used to formally represent associations. We also created ontological modules and illustrate how the extracted knowledge can be queried. The evaluation was performed using the content within BMJ Best Practice.

Results

MetaMap achieves an F measure of 88% (precision 85%, recall 91%) when applied directly to the total of 613 unique candidate terms for the 880 term pairs. When the processing of short forms is included, MetaMap achieves an F measure of 94% (precision 92%, recall 96%). Validation of the term pairs with BMJ Best Practice yields precision between 98 and 99%.
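The figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that "F measure" means the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly, and that the 880 term pairs arise as 11 conditions x 20 concepts x 2 methods x 2 datasets. The function name `f_measure` is illustrative only. Because the reported precision and recall are already rounded, the recomputed values are approximate.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F measure" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Term pairs: 11 target conditions x 20 related concepts
# x 2 deep learning methods x 2 datasets (PMSB and VetCN).
term_pairs = 11 * 20 * 2 * 2

# Precision and recall as reported (rounded) in the Results.
metamap_only = f_measure(0.85, 0.91)
with_short_forms = f_measure(0.92, 0.96)

print(term_pairs)                     # 880
print(round(metamap_only * 100))      # 88
print(round(with_short_forms * 100))  # 94
```

Under these assumptions, the recomputed values agree with the reported F measures of 88% and 94% and with the stated total of 880 term pairs.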

Conclusions

The Semantic Deep Learning approach can transform neural embeddings built from unstructured free-text data into reliable and reusable One Health knowledge using ontologies and content from BMJ Best Practice.

SUBMITTER: Arguello-Casteleiro M

PROVIDER: S-EPMC6849172 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

Exploring semantic deep learning for building reliable and reusable one health knowledge from PubMed systematic reviews and veterinary clinical notes.

Mercedes Arguello-Casteleiro, Robert Stevens, Julio Des-Diz, Chris Wroe, Maria Jesus Fernandez-Prieto, Nava Maroto, Diego Maseda-Fernandez, George Demetriou, Simon Peters, Peter-John M Noble, Phil H Jones, Jo Dukes-McEwan, Alan D Radford, John Keane, Goran Nenadic

Journal of Biomedical Semantics, 2019-11-12, Suppl 1

PMID: 31711540

Similar Datasets

Project description: Evidence-based decision making is a hallmark of effective veterinary clinical practice. Scoping reviews, systematic reviews, and meta-analyses are all methods intended to provide transparent and replicable ways of summarizing a body of research to address an important clinical or public health issue. As these methods are increasingly used by researchers and read by practitioners, it is important to understand the distinction between these techniques and to understand what research questions they can, and cannot, address. This review provides an overview of scoping reviews, systematic reviews, and meta-analysis, including a discussion of the methods and their uses. A sample dataset and code to conduct a simple meta-analysis in the statistical program R are also provided. Scoping reviews are a descriptive approach, designed to chart the literature around a particular topic. The approach involves an extensive literature search, followed by a structured mapping, or charting, of the literature. The results of scoping reviews can help to inform future research by identifying gaps in the existing literature and can also be used to identify areas where there may be a sufficient depth of literature to warrant a systematic review. Systematic reviews are intended to address a specific question by identifying and summarizing all of the available research that has addressed the review question. Question types that can be addressed by a systematic review include prevalence/incidence questions, and questions related to etiology, intervention efficacy, and diagnostic test accuracy. The systematic review process follows structured steps with multiple reviewers working in parallel to reduce the potential for bias. An extensive literature search is undertaken and, for each relevant study identified by the search, a formal extraction of data, including the effect size, and an assessment of the risk of bias are performed. The results from multiple studies can be combined using meta-analysis. Meta-analysis provides a summary effect size, and allows heterogeneity of effect among studies to be quantified and explored. These evidence synthesis approaches can provide scientific input to evidence-based clinical decision-making for veterinarians and regulatory bodies, and can also be useful for identifying gaps in the literature to enhance the efficiency of future research in a topic area.

Project description: BACKGROUND: With the development of high-throughput methods of gene analysis, there is a growing need for mining tools to retrieve relevant articles in PubMed. As PubMed grows, literature searches become more complex and time-consuming. Automated search tools with good precision and recall are necessary. We developed GO2PUB to automatically enrich PubMed queries with gene names, symbols and synonyms annotated by a GO term of interest or one of its descendants. RESULTS: GO2PUB enriches PubMed queries based on selected GO terms and keywords. It processes the result and displays the PMID, title, authors, abstract and bibliographic references of the articles. Gene names, symbols and synonyms that have been generated as extra keywords from the GO terms are also highlighted. GO2PUB is based on a semantic expansion of PubMed queries using the semantic inheritance between terms through the GO graph. Two experts manually assessed the relevance of GO2PUB, GoPubMed and PubMed on three queries about lipid metabolism. Experts' agreement was high (kappa = 0.88). GO2PUB returned 69% of the relevant articles, GoPubMed 40% and PubMed 29%. GO2PUB and GoPubMed have 17% of their results in common, corresponding to 24% of the total number of relevant results. 70% of the articles returned by more than one tool were relevant. 36% of the relevant articles were returned only by GO2PUB, 17% only by GoPubMed and 14% only by PubMed. To determine whether these results can be generalized, we generated twenty queries based on random GO terms with a granularity similar to those of the first three queries and compared the proportions of GO2PUB and GoPubMed results. These were 77% and 40%, respectively, for the first queries, and 70% and 38% for the random queries. The two experts also assessed the relevance of seven of the twenty queries (the three related to lipid metabolism and four related to other domains). Expert agreement was high (0.93 and 0.8). The performance of GO2PUB and GoPubMed was similar to that on the first queries. CONCLUSIONS: We demonstrated that the use of genes annotated by either GO terms of interest or a descendant of these GO terms yields some relevant articles ignored by other tools. The comparison of GO2PUB, based on semantic expansion, with GoPubMed, based on text mining techniques, showed that the two tools are complementary. The analysis of the randomly generated queries suggests that the results obtained about lipid metabolism can be generalized to other biological processes. GO2PUB is available at http://go2pub.genouest.org.
