Project description: Background: Key to the success of e-Science is the ability to computationally evaluate expert-composed hypotheses for validity against experimental data. Researchers face the challenge of collecting, evaluating and integrating large amounts of diverse information to compose and evaluate a hypothesis. Confronted with rapidly accumulating data, researchers currently lack the software tools to undertake the required information integration tasks. Results: We present HyQue, a Semantic Web tool for querying scientific knowledge bases with the purpose of evaluating user-submitted hypotheses. HyQue features a knowledge model to accommodate diverse hypotheses structured as events and represented using Semantic Web languages (RDF/OWL). Hypothesis validity is evaluated against experimental and literature-sourced evidence through a combination of SPARQL queries and evaluation rules. Inference over OWL ontologies (for type specifications, subclass assertions and parthood relations) and retrieval of facts stored as Bio2RDF linked data provide support for a given hypothesis. We evaluate hypotheses of varying levels of detail about the genetic network controlling galactose metabolism in Saccharomyces cerevisiae to demonstrate the feasibility of deploying such semantic computing tools over a growing body of structured knowledge in Bio2RDF. Conclusions: HyQue is a query-based hypothesis evaluation system that can currently evaluate hypotheses about galactose metabolism in S. cerevisiae. Hypotheses, as well as the supporting or refuting data, are represented in RDF and directly linked to one another, allowing scientists to browse from data to hypothesis and vice versa. HyQue hypotheses and data are available at http://semanticscience.org/projects/hyque.
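As a rough illustration of the querying step described above, the sketch below uses Python's SPARQLWrapper to ask a SPARQL endpoint for evidence linked to a hypothesised activation event. The endpoint URL and all vocabulary IRIs are hypothetical placeholders, not HyQue's actual schema.

```python
# Minimal sketch (not HyQue's actual API): query a SPARQL endpoint for
# evidence supporting a hypothesised "GAL4 activates GAL1" event.
# Endpoint and vocabulary IRIs below are illustrative placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.org/sparql"  # hypothetical Bio2RDF-style endpoint

query = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?evidence ?source WHERE {
  ?event a ex:ActivationEvent ;
         ex:agent  <http://example.org/gene/GAL4> ;
         ex:target <http://example.org/gene/GAL1> ;
         ex:supportedBy ?evidence .
  ?evidence ex:source ?source .
}
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["evidence"]["value"], row["source"]["value"])
```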
Project description: We present an algorithm for the per-voxel semantic segmentation of a three-dimensional volume. At the core of our algorithm is a novel "pyramid context" feature, a descriptive representation designed such that exact per-voxel linear classification can be made extremely efficient. This feature not only allows for efficient semantic segmentation but enables other aspects of our algorithm, such as novel learned features and a stacked architecture that can reason about self-consistency. We demonstrate our technique on 3D fluorescence microscopy data of Drosophila embryos for which we are able to produce extremely accurate semantic segmentations in a matter of minutes, and for which other algorithms fail due to the size and high-dimensionality of the data, or due to the difficulty of the task.
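The sketch below is a toy approximation of a multi-scale context feature of this flavor: per-voxel averages of the volume at several window sizes, stacked so that a linear classifier can score every voxel at once. It is illustrative only, not the paper's exact feature construction.

```python
# Toy pyramid-context-style feature: per-voxel means of the volume at
# several scales, stacked into one descriptor per voxel. Approximates the
# idea only; the paper's exact feature differs.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.linear_model import LogisticRegression

def pyramid_context(volume, scales=(1, 3, 9, 27)):
    # volume: 3D ndarray -> (n_voxels, n_scales) feature matrix
    feats = [uniform_filter(volume.astype(float), size=s) for s in scales]
    return np.stack([f.ravel() for f in feats], axis=1)

volume = np.random.rand(32, 32, 32)          # stand-in for microscopy data
labels = (volume > 0.5).ravel().astype(int)  # stand-in voxel labels

X = pyramid_context(volume)
clf = LogisticRegression(max_iter=1000).fit(X, labels)  # linear per-voxel model
pred = clf.predict(X).reshape(volume.shape)             # per-voxel segmentation
```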
Project description: BACKGROUND: With the advent of inexpensive assay technologies, there has been unprecedented growth in genomics data, as well as in the number of databases in which it is stored. In these databases, sample annotation using ontologies and controlled vocabularies is becoming more common. However, the annotation is rarely available as Linked Data, in a machine-readable format, or for standardized queries using SPARQL. This makes large-scale reuse or integration with other knowledge bases very difficult. METHODS: To address this challenge, we have developed the second generation of our eXframe platform, a reusable framework for creating online repositories of genomics experiments. This second-generation model now publishes Semantic Web data. To accomplish this, we created an experiment model that covers provenance, citations, external links, assays, biomaterials used in the experiment, and the data collected during the process. The elements of our model are mapped to classes and properties from various established biomedical ontologies. Resource Description Framework (RDF) data is automatically produced using these mappings and indexed in an RDF store with a built-in SPARQL Protocol and RDF Query Language (SPARQL) endpoint. CONCLUSIONS: Using the open-source eXframe software, institutions and laboratories can create Semantic Web repositories of their experiments, integrate them with heterogeneous resources, and make them interoperable with the vast Semantic Web of biomedical knowledge.
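A minimal sketch of the mapping step, assuming rdflib: an experiment record is emitted as RDF triples and serialized to Turtle. The vocabulary terms are illustrative stand-ins rather than eXframe's actual ontology mappings.

```python
# Sketch: emit RDF for a genomics experiment record using rdflib.
# The namespace and terms are illustrative placeholders, not eXframe's.
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/exframe/")

g = Graph()
exp = EX["experiment/42"]
g.add((exp, RDF.type, EX.GenomicsExperiment))
g.add((exp, EX.title, Literal("Stem cell RNA profiling")))
g.add((exp, EX.usesBiomaterial, EX["biosample/7"]))
g.add((exp, EX.hasAssay, EX["assay/microarray"]))

# The serialized triples would then be indexed in an RDF store that
# exposes a SPARQL endpoint, as described above.
print(g.serialize(format="turtle"))
```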
Project description: Objective: To develop a novel pharmacovigilance inferential framework to infer mechanistic explanations for asserted drug-drug interactions (DDIs) and deduce potential DDIs. Materials and methods: A mechanism-based DDI knowledge base was constructed by integrating knowledge from several existing sources at the pharmacokinetic, pharmacodynamic, pharmacogenetic, and multipathway interaction levels. A query-based framework was then created to utilize this integrated knowledge base, in conjunction with 9 inference rules, to infer mechanistic explanations for asserted DDIs and deduce potential DDIs. Results: The drug-drug interactions discovery and demystification (D3) system achieved an overall 85% recall rate in inferring mechanistic explanations for the DDIs integrated into its knowledge base, and a 61% precision rate in inferring (or withholding) mechanistic explanations for a balanced, randomly selected collection of interacting and noninteracting drug pairs. Discussion: The successful demonstration of the D3 system's ability to confirm interactions involving well-studied drugs enhances confidence in its ability to deduce interactions involving less-studied drugs. In its demonstration, the D3 system infers putative explanations for most of its integrated DDIs. Future enhancements to this work might include ranking interaction mechanisms by likelihood of applicability, determining the likelihood of deduced DDIs, and making the framework publicly available. Conclusion: The D3 system provides an early-warning framework for augmenting knowledge of known DDIs and deducing unknown DDIs. It shows promise in suggesting interaction pathways of interest for research and evaluation and in aiding clinicians in evaluating and adjusting courses of drug therapy.
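To make the rule-based inference concrete, here is a toy version of one plausible pharmacokinetic rule of the kind such a system might use (the actual D3 rules and knowledge base are not reproduced here): if drug A inhibits an enzyme that metabolizes drug B, flag a potential DDI.

```python
# Toy mechanism-based inference rule over a made-up mini knowledge base.
# Not the D3 system's actual rules or data; illustrative only.
inhibits       = {("fluoxetine", "CYP2D6"), ("ketoconazole", "CYP3A4")}
metabolized_by = {("codeine", "CYP2D6"), ("simvastatin", "CYP3A4")}

def infer_pk_ddis():
    # Rule: A inhibits enzyme E, B is metabolized by E  =>  potential PK DDI
    for a, enzyme in inhibits:
        for b, e2 in metabolized_by:
            if enzyme == e2 and a != b:
                yield a, b, f"{a} inhibits {enzyme}, which metabolizes {b}"

for a, b, why in infer_pk_ddis():
    print(f"potential DDI: {a} + {b} ({why})")
```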
Project description: Lyme disease is the most common tick-borne disease in the Northern Hemisphere. Existing estimates of Lyme disease spread are delayed by a year or more. We introduce Lymelight, a new method for monitoring the incidence of Lyme disease in real time. We use a machine-learned classifier of web search sessions to estimate the number of individuals who search for possible Lyme disease symptoms in a given geographical area over two years, 2014 and 2015. We evaluate Lymelight using official case count data from the CDC and find a 92% correlation (p < 0.001) at the county level. Importantly, using web search data allows us not only to assess the incidence of the disease but also to examine the appropriateness of the treatments users subsequently search for. Public health implications of our work include monitoring the spread of vector-borne diseases in a timely and scalable manner, complementing existing approaches through real-time detection, which can enable more timely interventions. Our analysis of treatment searches may also help reduce misdiagnosis of the disease.
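The county-level evaluation reduces to correlating estimated counts with official counts. A minimal sketch with made-up numbers (the study itself reports r = 0.92 against CDC data):

```python
# Sketch of the evaluation step: Pearson correlation between estimated and
# official per-county case counts. The numbers below are invented.
from scipy.stats import pearsonr

estimated = [120, 45, 310, 18, 77]   # Lymelight-style estimates per county
official  = [115, 50, 290, 25, 80]   # official case counts per county

r, p = pearsonr(estimated, official)
print(f"county-level correlation: r={r:.2f}, p={p:.3g}")
```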
Project description: Reasoning with inconsistencies is an important issue for the Semantic Web, as imperfect information is unavoidable in real applications. Accordingly, different paraconsistent approaches, valued for their capacity to draw nontrivial conclusions while tolerating inconsistencies, have been proposed for reasoning with inconsistent description logic knowledge bases. However, existing paraconsistent approaches are often criticized for being too skeptical. To address this, this paper presents a non-monotonic paraconsistent version of description logic reasoning, called minimally inconsistent reasoning, in which the inconsistencies tolerated during reasoning are minimized so that more reasonable conclusions can be inferred. Several desirable properties are studied, showing that the new semantics inherits the advantages of both non-monotonic reasoning and paraconsistent reasoning. A sound and complete tableau-based algorithm, called multi-valued tableaux, is developed to capture minimally inconsistent reasoning. The tableaux algorithm is in fact designed as a framework for multi-valued DL that accommodates different underlying paraconsistent semantics, differing only in the clash conditions. Finally, the complexity of minimally inconsistent description logic reasoning is shown to be on the same level as that of (classical) description logic reasoning.
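A propositional toy of the minimal-inconsistency idea (far simpler than the paper's description logic tableaux): enumerate three-valued models with an extra truth value B ("both true and false"), keep only those assigning B to as few atoms as possible, and read off what holds in every such model.

```python
# Toy minimal-inconsistency semantics over propositional clauses.
# KB = {p, ~p, r, ~r v s}: p is genuinely contradictory, but minimizing
# inconsistency keeps r and s classical, so s follows here even though
# a plain paraconsistent semantics would not force it.
from itertools import product

atoms = ["p", "r", "s"]
kb = [[("p", True)], [("p", False)], [("r", True)],
      [("r", False), ("s", True)]]  # clauses as (atom, positive?) literals

def sat_lit(val, positive):
    # A positive literal holds under T or B; a negated one under F or B.
    return val in ("T", "B") if positive else val in ("F", "B")

def is_model(assign):
    return all(any(sat_lit(assign[a], pos) for a, pos in cl) for cl in kb)

models = [dict(zip(atoms, vs)) for vs in product("TFB", repeat=len(atoms))
          if is_model(dict(zip(atoms, vs)))]
least = min(sum(v == "B" for v in m.values()) for m in models)
minimal = [m for m in models if sum(v == "B" for v in m.values()) == least]
print(minimal)  # [{'p': 'B', 'r': 'T', 's': 'T'}] -- only p stays inconsistent
```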
Project description: Researchers' networks have been subject to active modeling and analysis. Earlier literature mostly focused on citation or co-authorship networks reconstructed from annotated scientific publication databases, which have several limitations. Recently, general-purpose web search engines have also been utilized to collect information about social networks. Here we reconstructed, using web search engines, a network representing the relatedness of researchers to their peers as well as to various research topics. Relatedness between researchers and research topics was characterized by visibility boost: the increase in a researcher's visibility when the search is focused on a particular topic. We observed that researchers who received high visibility boosts from the same research topic tended to be close to each other in their network. We calculated correlations between visibility boosts by research topics and researchers' interdisciplinarity at the individual level (diversity of topics related to the researcher) and at the social level (their centrality in the researchers' network). We found that visibility boosts by certain research topics were positively correlated with researchers' individual-level interdisciplinarity, despite their negative correlations with the general popularity of researchers. We also found that visibility boosts by network-related topics correlated positively with researchers' social-level interdisciplinarity. Research topics' correlations with researchers' individual- and social-level interdisciplinarities were found to be nearly independent of each other. These findings suggest that the "interdisciplinarity" of a researcher should be understood as a multi-dimensional concept and evaluated using multiple assessment means.
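One plausible way to operationalize visibility boost from search hit counts, shown below as a hedged sketch (the study's exact formula may differ): take the ratio of a researcher's share of hits when queries are restricted to a topic versus unrestricted, so values above 1 mean the topic boosts that researcher's visibility.

```python
# Hedged sketch of a visibility-boost computation from hit counts.
# All counts below are invented; None means "no topic restriction".
hits = {
    ("alice", None): 5_000,  ("alice", "network science"): 900,
    ("bob",   None): 20_000, ("bob",   "network science"): 400,
}
total = {None: 1_000_000, "network science": 50_000}  # total hits per scope

def visibility_boost(name, topic):
    share_topic = hits[(name, topic)] / total[topic]  # share within topic
    share_plain = hits[(name, None)] / total[None]    # unrestricted share
    return share_topic / share_plain

for name in ("alice", "bob"):
    print(name, round(visibility_boost(name, "network science"), 2))
# alice 3.6 (boosted by the topic), bob 0.4 (not boosted)
```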
Project description: Background: HL7 Fast Healthcare Interoperability Resources (FHIR) is an emerging open standard for the exchange of electronic healthcare information. FHIR resources are defined in a specialized modeling language. FHIR instances can currently be represented in either XML or JSON. The FHIR and Semantic Web communities are developing a third FHIR instance representation format in Resource Description Framework (RDF). Shape Expressions (ShEx), a formal RDF data constraint language, is a candidate for describing and validating the FHIR RDF representation. Objective: Create a FHIR-to-ShEx model transformation and assess its ability to describe and validate FHIR RDF data. Methods: We created the methods and tools that generate ShEx schemas modeling the FHIR-to-RDF specification being developed by the HL7 ITS/W3C RDF Task Force, and evaluated the applicability of ShEx to describing and validating FHIR-to-RDF transformations. Results: The ShEx models contributed significantly to workgroup consensus. Algorithmic transformations from the FHIR model to ShEx schemas, and from FHIR example data to RDF, were incorporated into the FHIR build process. ShEx schemas representing 109 FHIR resources were used to validate 511 FHIR RDF data examples from the Standards for Trial Use (STU 3) Ballot version. We were able to uncover unresolved issues in the FHIR-to-RDF specification and detect 10 types of errors and their root causes in the actual implementation. The FHIR ShEx representations have been included in the official FHIR web pages for the STU 3 Ballot version since September 2016. Discussion: ShEx can be used to define and validate the syntax of a FHIR resource, which is complementary to the use of RDF Schema (RDFS) and the Web Ontology Language (OWL) for semantic validation. Conclusion: ShEx proved useful for describing a standard model of FHIR RDF data. The combination of a formal model and a succinct format enabled comprehensive review and automated validation.
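A minimal sketch of ShEx validation using the PyShEx library; the schema and data below are simplified stand-ins, not the actual FHIR ShEx schemas or FHIR RDF.

```python
# Validate a toy patient resource against a toy ShEx shape with PyShEx.
# Schema and data are simplified placeholders, not real FHIR artifacts.
from pyshex import ShExEvaluator

schema = """
PREFIX ex: <http://example.org/fhir/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ex:PatientShape {
  ex:gender [ "male" "female" "other" "unknown" ] ;
  ex:birthDate xsd:date?
}
"""

rdf = """
PREFIX ex: <http://example.org/fhir/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ex:pat1 ex:gender "female" ; ex:birthDate "1972-03-04"^^xsd:date .
"""

for r in ShExEvaluator(rdf=rdf, schema=schema,
                       focus="http://example.org/fhir/pat1",
                       start="http://example.org/fhir/PatientShape").evaluate():
    print("conforms" if r.result else f"fails: {r.reason}")
```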
Project description: Signal detection and management is a key activity in pharmacovigilance (PV). When a new PV signal is identified, the respective information is publicly communicated in the form of periodic newsletters or reports by organizations that monitor and investigate PV-related information (such as the World Health Organization and national PV centers). However, this type of communication does not allow for systematic access, discovery, and explicit data interlinking and, therefore, does not facilitate automated data sharing and reuse. In this paper, we present OpenPVSignal, a novel ontology that supports the semantic enrichment and rigorous communication of PV signal information in a systematic way, focusing on two key aspects: (a) publishing signal information according to the FAIR (Findable, Accessible, Interoperable, and Re-usable) data principles, and (b) exploiting automatic reasoning capabilities upon the interlinked PV signal report data. OpenPVSignal is developed as a reusable, extendable and machine-understandable model based on Semantic Web standards/recommendations. In particular, it can be used to model PV signal report data focusing on: (a) heterogeneous data interlinking, (b) semantic and syntactic interoperability, (c) provenance tracking and (d) knowledge expressiveness. OpenPVSignal is built upon widely accepted semantic models, namely the Provenance Ontology (PROV-O), the Micropublications semantic model, the Web Annotation Data Model (WADM), the Ontology of Adverse Events (OAE) and the Time Ontology. We describe the design of OpenPVSignal and demonstrate its applicability as well as the reasoning capabilities enabled by its use. We also evaluate the model against the FAIR data principles. The applicability of OpenPVSignal is demonstrated using PV signal information published in: (a) the World Health Organization's Pharmaceuticals Newsletter, (b) the Netherlands Pharmacovigilance Centre Lareb website and (c) the U.S. Food and Drug Administration (FDA) Drug Safety Communications, also available on the FDA website.
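A small sketch of what publishing a signal as linked data with provenance might look like in rdflib; the OpenPVSignal class and property names here are hypothetical placeholders, and only the PROV-O terms are the real W3C vocabulary.

```python
# Sketch: a PV signal report as RDF with PROV-O provenance links.
# PVS terms are invented stand-ins for illustration; PROV is the real
# W3C provenance vocabulary.
from rdflib import Graph, Namespace, Literal, RDF, XSD

PROV = Namespace("http://www.w3.org/ns/prov#")
PVS  = Namespace("http://example.org/openpvsignal#")

g = Graph()
sig = PVS["signal/123"]
g.add((sig, RDF.type, PVS.Signal))
g.add((sig, PVS.drug, Literal("drug X")))
g.add((sig, PVS.adverseEvent, Literal("QT prolongation")))
g.add((sig, PROV.wasDerivedFrom, PVS["report/who-newsletter-2017-4"]))
g.add((sig, PROV.generatedAtTime, Literal("2017-09-01", datatype=XSD.date)))

print(g.serialize(format="turtle"))
```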
Project description: Background: Herd immunity, or community immunity, refers to the reduced risk of infection among susceptible individuals in a population due to the presence and proximity of immune individuals. Recent studies suggest that improving the understanding of community immunity may increase intentions to get vaccinated. Objective: This study aims to design a web application about community immunity and optimize it based on users' cognitive and emotional responses. Methods: Our multidisciplinary team developed a web application about community immunity to communicate epidemiological evidence in a personalized way. In our application, people build their own community by creating an avatar representing themselves and 8 other avatars representing people around them, for example, their family or coworkers. The application integrates these avatars in a 2-minute visualization showing how different parameters (eg, vaccine coverage and contact within communities) influence community immunity. We predefined communication goals, created prototype visualizations, and tested four iterative versions of our visualization in a university-based human-computer interaction laboratory and in community-based settings (a cafeteria, two shopping malls, and a public library). Data included psychophysiological measures (eye tracking, galvanic skin response, facial emotion recognition, and electroencephalography) to assess participants' cognitive and affective responses to the visualization, and verbal feedback to assess their interpretations of the visualization's content and messaging. Results: Among the 110 participants across all four cycles, 68 (61.8%) were women and 38 (34.5%) were men; 4 (3.6%) did not report gender. The mean age was 38 (SD 17) years. More than half (65/110, 59.1%) of participants reported having a university-level education. Iterative changes across the cycles included adding the ability for users to create their own avatars, specific signals about who was represented by the different avatars, the use of color and movement to indicate protection or lack of protection from infectious disease, and changes to terminology to ensure clarity for people with varying educational backgrounds. Overall, we observed three generalizable findings. First, visualization does indeed appear to be a promising medium for conveying what community immunity is and how it works. Second, by involving multiple users in an iterative design process, it is possible to create a short and simple visualization that clearly conveys a complex topic. Finally, evaluating users' emotional responses during the design process, in addition to their cognitive responses, offers insights that help inform the final design of an intervention. Conclusions: Visualization with personalized avatars may help people understand their individual roles in population health. Our application showed promise as a method of communicating the relationship between individual behavior and community health. Next steps will include assessing the effects of the application on risk perception, knowledge, and vaccination intentions in a randomized controlled trial. This study offers a potential road map for designing health communication materials for complex topics such as community immunity.
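As a purely illustrative companion to the parameters the visualization conveys, the toy simulation below shows how raising vaccine coverage shrinks how far an infection spreads through a small community of 9 individuals; it is not the application's actual model.

```python
# Toy spread simulation: higher vaccine coverage -> smaller outbreaks.
# Parameters and dynamics are invented for illustration only.
import random

def outbreak_size(n=9, coverage=0.6, p_transmit=0.5, contacts=4, seed=0):
    rng = random.Random(seed)
    immune = {i for i in range(n) if rng.random() < coverage}
    infected, frontier = set(), {0} - immune  # index 0 is the seed case
    while frontier:
        infected |= frontier
        nxt = set()
        for _ in frontier:                     # each case makes random contacts
            for _ in range(contacts):
                j = rng.randrange(n)
                if j not in immune and j not in infected and rng.random() < p_transmit:
                    nxt.add(j)
        frontier = nxt
    return len(infected)

for cov in (0.0, 0.4, 0.8):
    print(f"coverage {cov:.0%}: {outbreak_size(coverage=cov)} of 9 infected")
```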