Dataset Information

Evaluating active learning methods for annotating semantic predications.

ABSTRACT:

Objectives

This study evaluated and compared a variety of active learning strategies, including a novel strategy we proposed, as applied to the task of filtering incorrect semantic predications in SemMedDB.

Materials and methods

We evaluated 8 active learning strategies covering 3 types (uncertainty, representative, and combined) on 2 datasets totaling 6,000 semantic predications from SemMedDB, covering the domains of substance interactions and clinical medicine, respectively. We also designed a novel combined strategy called dynamic β that does not use hand-tuned hyperparameters. Each strategy was assessed by the Area under the Learning Curve (ALC) and the number of training examples required to achieve a target area under the ROC curve (AUC). We also visualized and compared the query patterns of the strategies.
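The two evaluation measures named above can be illustrated with a minimal sketch. It assumes a learning curve recorded as (number of labeled examples, AUC) pairs and a trapezoidal ALC normalized by the x-range; the abstract does not state the authors' exact normalization, and the function names and numbers below are hypothetical, not results from the study.

```python
import numpy as np


def area_under_learning_curve(n_labeled, auc_scores):
    """Trapezoidal area under the (n_labeled, AUC) curve.

    Dividing by the x-range keeps the value on the same 0-1 scale as AUC.
    This normalization is an assumption, not taken from the paper.
    """
    x = np.asarray(n_labeled, dtype=float)
    y = np.asarray(auc_scores, dtype=float)
    area = np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0)
    return area / (x[-1] - x[0])


def examples_to_reach(n_labeled, auc_scores, target):
    """Smallest recorded training-set size whose AUC meets the target, else None."""
    for n, auc in zip(n_labeled, auc_scores):
        if auc >= target:
            return n
    return None


# Hypothetical learning curves, for illustration only.
sizes = [100, 200, 300, 400, 500]
baseline_auc = [0.60, 0.66, 0.70, 0.73, 0.75]
active_auc = [0.60, 0.70, 0.74, 0.76, 0.77]

alc_gain = (area_under_learning_curve(sizes, active_auc)
            - area_under_learning_curve(sizes, baseline_auc))
n_base = examples_to_reach(sizes, baseline_auc, 0.70)
n_active = examples_to_reach(sizes, active_auc, 0.70)
effort_saved = 1.0 - n_active / n_base  # fraction of annotations avoided
```

With curves like these, the ALC gain summarizes performance over the whole annotation budget, while the second measure expresses the saving as a fraction of labels avoided at a fixed target AUC.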

Results

All types of active learning (AL) methods beat the baseline on both datasets. Combined strategies outperformed all other methods in terms of ALC, exceeding the baseline by over 0.05 ALC on both datasets and reducing annotation effort by 58% in the best case. While representative strategies performed well, their performance was matched or exceeded by the combined methods. Our proposed AL method, dynamic β, shows a promising ability to achieve near-optimal performance across both datasets.

Discussion

Our visual analysis of query patterns indicates that strategies which efficiently obtain a representative subsample perform better on this task.
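To make the role of representativeness concrete, here is a minimal sketch of a generic combined query score: uncertainty multiplied by density raised to a fixed exponent `beta`, in the style of information-density sampling. This is not the authors' dynamic β strategy, whose details are not given in this abstract. The function names, the cosine-similarity density measure, and the toy data are all assumptions for illustration.

```python
import numpy as np


def uncertainty(probs):
    """Binary uncertainty: 1.0 at p = 0.5, falling to 0.0 at p = 0 or 1."""
    p = np.asarray(probs, dtype=float)
    return 1.0 - 2.0 * np.abs(p - 0.5)


def representativeness(X):
    """Mean cosine similarity of each pool example to the whole pool."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return (unit @ unit.T).mean(axis=1)


def combined_query(probs, X, beta=1.0):
    """Index of the pool example maximizing uncertainty * density**beta.

    beta = 0 reduces to pure uncertainty sampling; larger beta favors
    examples in dense regions of the pool. Here beta is a fixed,
    hand-set value (hypothetical), unlike the paper's dynamic β.
    """
    density = np.clip(representativeness(X), 0.0, None)
    score = uncertainty(probs) * density ** beta
    return int(np.argmax(score))


# Toy pool: two similar examples and one outlier the model is least sure about.
pool = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pred_probs = [0.55, 0.60, 0.50]

pick_uncertain = combined_query(pred_probs, pool, beta=0.0)
pick_combined = combined_query(pred_probs, pool, beta=1.0)
```

In this toy pool, pure uncertainty selects the isolated outlier, while the density-weighted score selects an example from the denser cluster. That is the kind of behavior the discussion above associates with better performance on this task.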

Conclusion

Active learning is shown to be effective at reducing annotation costs for filtering incorrect semantic predications from SemMedDB. Our proposed AL method demonstrated promising performance.

SUBMITTER: Vasilakes J

PROVIDER: S-EPMC6367018 | biostudies-literature | 2018 Oct

REPOSITORIES: biostudies-literature


Publications

Evaluating active learning methods for annotating semantic predications.

Vasilakes J, Rizvi R, Melton GB, Pakhomov S, Zhang R

JAMIA Open, 2018-06-27, issue 2


PMID: 30740594

Similar Datasets

Project description: Background: Nepal has achieved and sustained the elimination of leprosy as a public health problem since 2009, but 17 districts and 3 provinces with 41% (10,907,128) of Nepal's population have yet to eliminate the disease. Pediatric cases and grade-2 disabilities (G2D) indicate recent transmission and late diagnosis, respectively, which necessitate active and early case detection. This operational research was performed to identify approaches best suited for early case detection, determine community-based leprosy epidemiology, and identify hidden leprosy cases early and respond with prompt treatment. Methods: Active case detection was undertaken in two Nepali provinces with the greatest burden of leprosy, Madhesh Province (40% of national cases) and Lumbini Province (18%), and in at-risk prison populations in Madhesh, Lumbini and Bagmati provinces. Case detection was performed by (1) house-to-house visits among vulnerable populations (n = 26,469) and (2) contact examination and tracing (n = 7608), both in Madhesh and Lumbini Provinces, and (3) screening prison populations (n = 4428) in Madhesh, Lumbini and Bagmati Provinces of Nepal. Per-case direct medical and non-medical costs for each approach were calculated. Results: New case detection rates were highest for contact tracing (250), followed by house-to-house visits (102) and prison screening (45) per 100,000 population screened. However, the cost per case identified was cheapest for house-to-house visits [Nepalese rupee (NPR) 76,500/case], followed by contact tracing (NPR 90,286/case) and prison screening (NPR 298,300/case). House-to-house and contact tracing case paucibacillary/multibacillary (PB:MB) ratios were 59:41 and 68:32; female/male ratios 63:37 and 57:43; pediatric cases 11% in both approaches; and grade-2 disabilities (G2D) 11% and 5%, respectively. Developing leprosy was not significantly different among household and neighbor contacts [odds ratios (OR) = 1.4, 95% confidence interval (CI): 0.24-5.85] and for contacts of MB versus PB cases (OR = 0.7, 95% CI 0.26-2.0). Attack rates were not significantly different among household contacts of MB cases (0.32%, 95% CI 0.07-0.94%) and PB cases (0.13%, 95% CI 0.03-0.73) (χ2 = 0.07, df = 1, P = 0.9) and neighbor contacts of MB cases (0.23%, 0.1-0.46) and PB cases (0.48%, 0.19-0.98) (χ2 = 0.8, df = 1, P = 0.7). BCG vaccination with scar presence had a significant protective effect against leprosy (OR = 0.42, 0.22-0.81).
Conclusions: The most effective case identification approach here is contact tracing, followed by house-to-house visits in vulnerable populations and screening in prisons, although house-to-house visits are cheaper. The findings suggest that hidden cases, recent transmission, and late diagnosis in the community exist and highlight the importance of early case detection.

Project description: BACKGROUND: The current paradigm of arthroscopic training lacks objective evaluation of technical ability and its adequacy is concerning given the accelerating complexity of the field. To combat insufficiencies, emphasis is shifting towards skill acquisition outside the operating room and sophisticated assessment tools. We reviewed (1) the validity of cadaver and surgical simulation in arthroscopic training, (2) the role of psychomotor analysis and arthroscopic technical ability, (3) what validated assessment tools are available to evaluate technical competency, and (4) the quantification of arthroscopic proficiency. METHODS: The Medline and Embase databases were searched for published articles in the English literature pertaining to arthroscopic competence, arthroscopic assessment and evaluation and objective measures of arthroscopic technical skill. Abstracts were independently evaluated and exclusion criteria included articles outside the scope of knee and shoulder arthroscopy as well as original articles about specific therapies, outcomes and diagnoses leaving 52 articles cited in this review. RESULTS: Simulated arthroscopic environments exhibit high levels of internal validity and consistency for simple arthroscopic tasks, however the ability to transfer complex skills to the operating room has not yet been established. Instrument and force trajectory data can discriminate between technical ability for basic arthroscopic parameters and may serve as useful adjuncts to more comprehensive techniques. There is a need for arthroscopic assessment tools for standardized evaluation and objective feedback of technical skills, yet few comprehensive instruments exist, especially for the shoulder. Opinion on the required arthroscopic experience to obtain proficiency remains guarded and few governing bodies specify absolute quantities.
CONCLUSIONS: Further validation is required to demonstrate the transfer of complex arthroscopic skills from simulated environments to the operating room and provide objective parameters to base evaluation. There is a deficiency of validated assessment tools for technical competencies and little consensus of what constitutes a sufficient case volume within the arthroscopy community.

Project description: Introduction: With the increasing number of Covid-19 cases as well as care costs, chest diseases have gained increasing interest in several communities, particularly in medical and computer vision. Clinical and analytical exams are widely recognized techniques for diagnosing and handling Covid-19 cases. However, strong detection tools can help avoid damage to chest tissues. The proposed method provides an important way to enhance the semantic segmentation process using combined potential deep learning (DL) modules to increase consistency. Based on Covid-19 CT images, this work hypothesized that a novel model for semantic segmentation might be able to extract definite graphical features of Covid-19 and afford an accurate clinical diagnosis while optimizing the classical test and saving time. Methods: CT images were collected considering different cases (normal chest CT, pneumonia, typical viral causes, and Covid-19 cases). The study presents an advanced DL method to deal with chest semantic segmentation issues. The approach employs a modified version of the U-net to enable and support Covid-19 detection from the studied images. Results: The validation tests demonstrated competitive results with important performance rates: Precision (90.96% ± 2.5) with an F-score of (91.08% ± 3.2), an accuracy of (93.37% ± 1.2), a sensitivity of (96.88% ± 2.8) and a specificity of (96.91% ± 2.3). In addition, the visual segmentation results are very close to the ground truth. Conclusion: The findings of this study reveal the proof-of-principle for using cooperative components to strengthen the semantic segmentation modules for effective and truthful Covid-19 diagnosis. Implications for practice: This paper has highlighted that DL based approach, with several modules, may be contributing to provide strong support for radiographers and physicians, and that further use of DL is required to design and implement performant automated vision systems to detect chest diseases.

