Project description: Document classification is an important component of natural language processing, with applications including sentiment analysis, content recommendation, and information retrieval. This article investigates the potential of the Large Language Model Meta AI (LLaMA2), a state-of-the-art language model, to improve document classification in English. Our experiments show that LLaMA2 outperforms traditional classification methods, achieving higher precision and recall on the WOS-5736 dataset. We also analyze the interpretability of LLaMA2's classification process, revealing the features most pertinent to categorization and shedding light on the model's decision-making. These results underscore the potential of advanced language models to improve classification outcomes and to provide a deeper understanding of document structure, thereby contributing to the advancement of natural language processing methodology.
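The description does not specify how LLaMA2 is applied to classification; a common setup is to attach a sequence-classification head to the pretrained model and fine-tune it on labeled documents. Below is a minimal sketch using Hugging Face Transformers. The checkpoint name, the class count (WOS-5736 is commonly reported with 11 fine-grained categories), and the truncation length are illustrative assumptions, not details taken from the study.

```python
# Hedged sketch: LLaMA-2 with a sequence-classification head as a document
# classifier. Checkpoint, num_labels, and max_length are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; gated, requires access
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA defines no pad token

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=11  # assumed: 11 WOS-5736 categories
)
model.config.pad_token_id = tokenizer.pad_token_id

def classify(text: str) -> int:
    """Return the predicted class index for one document."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))
```

In practice the head would be fine-tuned on the labeled training split before precision and recall are measured on held-out documents.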
Project description: Temporal Information Retrieval (TIR) has recently attracted considerable attention from the information retrieval community. TIR exploits the temporal dynamics of the retrieval process, harnessing both textual relevance and temporal relevance to satisfy a user's temporal information needs (Ur Rehman Khan et al., 2018). The focus time of a document is an important temporal aspect, defined as the time to which the content of the document refers (Jatowt et al., 2015; Jatowt et al., 2013; Morbidoni et al., 2018; Khan et al., 2018). To the best of our knowledge, no publicly available benchmark dataset exists that can comprehensively evaluate the performance of focus time assessment strategies. We have therefore produced the Event-dataset, comprising 35 queries and a set of news articles for each query. Formally, C = {Q, D}, where C is the dataset, Q = {q1, q2, q3, ..., q35} is the query set, and each query qi is paired with a document set Di = {Dr, Dnr}, where Dr and Dnr are the sets of relevant and non-relevant documents, respectively. Each query in the dataset represents a popular event. To annotate the articles as relevant or non-relevant, we employed a user-study-based evaluation in which a group of postgraduate students manually assigned the articles to these categories. We believe this dataset offers information retrieval researchers a benchmark for evaluating focus time assessment methods specifically and information retrieval methods more generally.
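To make the collection structure C = {Q, D} concrete, the sketch below models it as plain Python data. The field names and in-memory layout are illustrative assumptions, since the description does not specify a storage format.

```python
# Hedged sketch of the Event-dataset structure C = {Q, D}: 35 event queries,
# each paired with annotator-judged relevant (D_r) and non-relevant (D_nr)
# news articles. Field names are assumptions for illustration.
from dataclasses import dataclass, field

@dataclass
class QueryEntry:
    query: str                                              # a popular event
    relevant: list[str] = field(default_factory=list)       # D_r articles
    non_relevant: list[str] = field(default_factory=list)   # D_nr articles

dataset: list[QueryEntry] = [
    QueryEntry(
        query="example event query",
        relevant=["article text judged relevant by annotators ..."],
        non_relevant=["article text judged non-relevant ..."],
    ),
    # ... 35 queries in total
]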
Project description: Chinese hamster ovary (CHO) cells are widely used for mass production of therapeutic proteins in the pharmaceutical industry. With the growing need to optimize the performance of producer CHO cell lines, research on CHO cell line development and bioprocessing has continued to grow in recent decades. Bibliographic mapping and classification of relevant research studies are essential for identifying research gaps and trends in the literature. To understand the CHO literature both qualitatively and quantitatively, we conducted topic modeling on a CHO bioprocess bibliome manually compiled in 2016 and compared the topics uncovered by Latent Dirichlet Allocation (LDA) models with the human labels of the CHO bibliome. The results show substantial overlap between the manually selected categories and the computationally generated topics, and reveal topic-specific characteristics of the machine-generated topics. To identify relevant CHO bioprocessing papers in new scientific literature, we developed supervised logistic regression models to identify specific article topics and evaluated the results on three CHO bibliome datasets: the Bioprocessing, Glycosylation, and Phenotype sets. Using top terms as features supports the explainability of the document classification results, yielding insights into new CHO bioprocessing papers.
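As a hedged illustration of the two analyses described, unsupervised LDA topic discovery and a supervised logistic-regression classifier over top terms, the sketch below uses scikit-learn. The corpus, topic count, and feature settings are assumptions rather than the study's exact configuration.

```python
# Hedged sketch: LDA topic modeling plus a logistic-regression classifier,
# in the spirit of the CHO bibliome analysis. All parameters are illustrative.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression

abstracts = ["...CHO cell line development...", "...fed-batch bioprocess..."]  # placeholder corpus
labels = [1, 0]  # 1 = relevant to CHO bioprocessing (assumed labeling)

# Unsupervised: uncover topics and inspect their top terms.
counts = CountVectorizer(stop_words="english", max_features=5000)
X_counts = counts.fit_transform(abstracts)
lda = LatentDirichletAllocation(n_components=10, random_state=0).fit(X_counts)
terms = counts.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = [terms[i] for i in comp.argsort()[-10:][::-1]]
    print(f"topic {k}: {', '.join(top)}")

# Supervised: classify new papers; coefficients on top terms aid explainability.
tfidf = TfidfVectorizer(stop_words="english")
clf = LogisticRegression(max_iter=1000).fit(tfidf.fit_transform(abstracts), labels)
```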
Project description: With the exponential growth in the daily publication of scientific articles, automatic classification and categorization can assist in assigning articles to predefined categories. Article titles are concise descriptions of an article's content and carry valuable information for document classification and categorization. However, their shortness, data sparseness, limited word occurrences, and inadequate contextual information hinder the direct application of conventional text mining and machine learning algorithms, making the classification of scientific document titles a challenging task. This study first evaluates the performance of our earlier method, TextNetTopics, on short texts. We then propose an advanced version, TextNetTopics Pro, a novel short-text classification framework that combines lexical features organized into topics of words with the topic distributions extracted by a topic model, alleviating the data-sparseness problem when classifying short texts. We evaluate the proposed approach using nine state-of-the-art short-text topic models on two publicly available datasets of scientific article titles, one from the biomedical field and the other from computer science publications. Additionally, we compare the predictive performance of models generated with and without the abstracts. Finally, we demonstrate the robustness and effectiveness of the proposed approach in handling imbalanced data, particularly in classifying Drug-Induced Liver Injury articles as part of the CAMDA challenge. Exploiting the semantic information detected by topic models proved to be a reliable way to improve the overall performance of machine learning classifiers.
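The core idea, fusing lexical features grouped into topics of words with per-document topic distributions, can be sketched as below. This is a simplified stand-in, not the TextNetTopics Pro implementation; the topic count, the top-words-per-topic heuristic, and the classifier choice are all assumptions.

```python
# Hedged sketch: fuse topic-word lexical features with topic distributions,
# loosely following the TextNetTopics Pro idea. Not the authors' code.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

titles = ["deep learning for protein structure", "graph algorithms on GPUs"]  # placeholders
labels = [0, 1]

vec = CountVectorizer(stop_words="english")
X_words = vec.fit_transform(titles)

lda = LatentDirichletAllocation(n_components=5, random_state=0)
X_topics = lda.fit_transform(X_words)        # per-document topic distribution

# Keep the top words of each topic as the lexical feature set (heuristic).
top_word_idx = np.unique(lda.components_.argsort(axis=1)[:, -20:])
X_lexical = X_words[:, top_word_idx]

X = hstack([X_lexical, csr_matrix(X_topics)])  # fused feature matrix
clf = RandomForestClassifier(random_state=0).fit(X, labels)
```

Concatenating the two views lets the classifier fall back on topic proportions when a short title shares few surface words with the training data.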
Project description: GeneReporter is a web tool that reports functional information and relevant literature for a protein-coding sequence of interest. Its purpose is to support both manual genome annotation and document retrieval. PubMed references corresponding to a sequence are detected by extracting query words from the UniProt entries of homologous sequences. Data on protein families, domains, potential cofactors, structure, function, cellular localization, metabolic contribution, and corresponding DNA binding sites complement the information on a given gene product of interest. Availability and implementation: GeneReporter is available at http://www.genereporter.tu-bs.de. The web site integrates databases and analysis tools as SOAP-based web services from the EBI (European Bioinformatics Institute) and NCBI (National Center for Biotechnology Information).
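The reference-detection step described (extracting query words from the UniProt entry of a homologous sequence, then searching PubMed) can be approximated with today's public APIs, as in the sketch below. It uses the current UniProt REST endpoint and NCBI E-utilities via Biopython; these are assumptions standing in for GeneReporter's internal SOAP-based pipeline, and the query construction is deliberately naive.

```python
# Hedged sketch of GeneReporter's reference-detection idea: pull keywords
# from a homologous UniProt entry, then search PubMed with them. Uses public
# UniProt REST and NCBI E-utilities; not GeneReporter's own implementation.
import requests
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI requires a contact address; placeholder

def uniprot_keywords(accession: str) -> list[str]:
    """Fetch keyword names from a UniProtKB entry."""
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.json"
    entry = requests.get(url, timeout=30).json()
    return [kw["name"] for kw in entry.get("keywords", [])]

def pubmed_refs(keywords: list[str], retmax: int = 10) -> list[str]:
    """Search PubMed with a naive AND-query over the first few keywords."""
    term = " AND ".join(keywords[:3])  # simplistic query construction (assumption)
    handle = Entrez.esearch(db="pubmed", term=term, retmax=retmax)
    record = Entrez.read(handle)
    handle.close()
    return record["IdList"]

print(pubmed_refs(uniprot_keywords("P12345")))
```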
Project description: Background: The validity and reliability of longitudinal research depend heavily on the recruitment and retention of representative samples. Various strategies have been developed and tested for improving recruitment and follow-up rates in health-behavioural research, but few studies have examined the role of linguistic choices and study document readability in participation rates. This study examined the impact of one small text change, assigning an inappropriate or grade-8 reading-level password for intervention access, on participation rates and attrition in an online alcohol intervention trial. Methods: Participants were recruited into an online alcohol intervention study using Amazon's Mechanical Turk via a multi-step recruitment process that required participants to log into a study portal with a pre-assigned password. Passwords were qualitatively coded as grade-8 and/or inappropriate for use in a professional setting. Separate logistic regressions examined which demographic and clinical characteristics and password categorizations were most strongly associated with recruitment rates and follow-up completion. Results: Inappropriate passwords were a barrier to recruitment among participants with post-secondary education compared with those with less education (p = 0.044), while grade-8 passwords appeared to significantly facilitate the completion of 6-month follow-ups (p = 0.005). Conclusions: Altogether, these findings suggest that some linguistic choices may play an important role in recruitment, while others, such as readability, may have longer-term effects on follow-up rates and attrition. Possible explanations for the findings, as well as sample selection biases during recruitment and follow-up, are discussed. Limitations of the study are stated and recommendations for researchers are provided. Trial registration: ClinicalTrials.gov NCT02977026. Registered 27 Nov 2016.
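The analytic step, separate logistic regressions predicting recruitment and follow-up from password categorizations and participant characteristics, could be run as in the sketch below with statsmodels. The data frame and every variable name are invented for illustration; the interaction term reflects the reported education-dependent effect of inappropriate passwords.

```python
# Hedged sketch of the study's analytic approach: separate logistic
# regressions for recruitment and 6-month follow-up. Data are invented.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "recruited":        [1, 0, 1, 0, 1, 0, 0, 1],
    "followed_up_6mo":  [1, 0, 1, 1, 0, 0, 1, 0],
    "inappropriate_pw": [0, 0, 0, 0, 1, 1, 1, 1],  # coded from assigned password
    "grade8_pw":        [1, 0, 1, 0, 1, 0, 1, 0],
    "post_secondary":   [0, 0, 1, 1, 0, 0, 1, 1],
})

# Recruitment: inappropriate passwords, moderated by education level.
recruit_model = smf.logit(
    "recruited ~ inappropriate_pw * post_secondary", data=df).fit(disp=0)

# Retention: grade-8 readability passwords and follow-up completion.
retention_model = smf.logit(
    "followed_up_6mo ~ grade8_pw", data=df).fit(disp=0)

print(recruit_model.summary())
```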
Project description: Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, a text-mining system for biological literature that marks up full text according to a shallow ontology of terms of biological interest. This project investigates document classification in the context of biological literature, making use of the Textpresso markup of a corpus of Caenorhabditis elegans literature. Results: We present a two-step text categorization algorithm to classify a corpus of C. elegans papers. Our method first applies a support vector machine-trained classifier, followed by a novel phrase-based clustering algorithm. The clustering step autonomously creates cluster labels that are descriptive and understandable by humans, and the clustering engine outperformed previously published results on a standard test set (Reuters-21578; F-value of 0.55 vs. 0.49) while producing cluster descriptions that appear more useful. A web interface allows researchers to quickly navigate the hierarchy and find documents belonging to a specific concept. Conclusion: We have demonstrated a simple method for classifying biological documents that improves on current methods. Although the classification results are currently optimized for Caenorhabditis elegans papers by human-created rules, the classification engine can be adapted to different types of documents.
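A hedged sketch of the two-step scheme (an SVM classifier followed by clustering with human-readable labels) is given below using scikit-learn. The phrase-labeling step here is a simple TF-IDF n-gram heuristic standing in for the paper's novel phrase-based algorithm, and the corpus and class names are invented.

```python
# Hedged sketch of a two-step categorization: SVM classification, then
# clustering with automatically generated descriptive labels. The labeling
# heuristic is a stand-in, not Textpresso's phrase-based algorithm.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

papers = ["RNAi knockdown of daf-2 extends lifespan",
          "axon guidance in C. elegans"]  # placeholder corpus
labels = [0, 1]  # coarse classes, e.g. aging vs. neurobiology (assumed)

vec = TfidfVectorizer(ngram_range=(1, 3), stop_words="english")
X = vec.fit_transform(papers)

# Step 1: supervised classification into broad categories.
svm = LinearSVC().fit(X, labels)

# Step 2: cluster within a category; label each cluster by its top phrases.
km = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
terms = vec.get_feature_names_out()
for k, center in enumerate(km.cluster_centers_):
    top_phrases = [terms[i] for i in center.argsort()[-3:][::-1]]
    print(f"cluster {k}: {'; '.join(top_phrases)}")
```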
Project description: Background: Patient education materials given to breast cancer survivors may not fit their information needs. Needs may change over time, be forgotten, or be misreported for a variety of reasons. An automated content analysis of survivors' postings to online health forums can identify expressed information needs over a span of time and can be repeated regularly at low cost. Identifying these unmet needs can guide improvements to existing education materials and the creation of new resources. Objective: The primary goals of this project are to assess the unmet information needs of breast cancer survivors from their own perspectives and to identify gaps between information needs and current education materials. Methods: The approach employs computational methods for content modeling and supervised text classification of online health forum data to identify explicit and implicit requests for health-related information. Potential gaps between needs and education materials are identified using techniques from information retrieval. Results: We provide a new taxonomy for classifying sentences in online health forum data. A total of 260 postings from two online health forums were selected, yielding 4179 sentences for coding. After annotating the data and training alternative one-versus-others classifiers, a random forest-based approach achieved F1 scores from 66% (Other, dataset 2) to 90% (Medical, dataset 1) on the primary information types. From these data, 136 expressions of need were used to generate queries against indexed education materials. Upon examination of the two best pages retrieved for each query, 12% (17/136) of queries were judged to have relevant content by all coders, and 33% (45/136) by at least one coder. Conclusions: Text from online health forums can be analyzed effectively using automated methods. Our analysis confirms that breast cancer survivors have many information needs that are not covered by the written documents they typically receive; our results suggest that at most a third of breast cancer survivors' questions would be addressed by the materials currently provided to them.
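The one-versus-others random-forest step can be sketched as below with scikit-learn. The sentences and most type names are invented ("Medical" and "Other" are taken from the reported results), and the features and parameters are assumptions rather than the study's configuration.

```python
# Hedged sketch: one-versus-rest random forests for sentence-level
# information-type classification, as in the forum analysis. Data invented.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier

sentences = ["Will tamoxifen cause joint pain?",   # placeholder forum sentences
             "My scan is next week."]
types = ["Medical", "Other"]  # two of the taxonomy's information types

X = TfidfVectorizer().fit_transform(sentences)
clf = OneVsRestClassifier(RandomForestClassifier(random_state=0)).fit(X, types)

pred = clf.predict(X)
print(f1_score(types, pred, average=None, labels=["Medical", "Other"]))
```

The classified need-expressing sentences would then be turned into queries against the indexed education materials to surface coverage gaps.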
Project description: Background: Behavioral interventions such as psychotherapy are leading, evidence-based practices for a variety of problems (e.g., substance abuse), but evaluating provider fidelity to behavioral interventions is limited by the need for human judgment. The current study evaluated the accuracy of statistical text classification in replicating human-based judgments of provider fidelity in one specific psychotherapy: motivational interviewing (MI). Method: Participants (n = 148) came from five previously conducted randomized trials and were either primary care patients at a safety-net hospital or university students. To be eligible for the original studies, participants met criteria for problematic drug or alcohol use. All participants received a form of brief motivational interview, an evidence-based intervention for alcohol and substance use disorders. The Motivational Interviewing Skills Code, a standard measure of MI provider fidelity based on human ratings, was used to evaluate all therapy sessions. A text classification approach called a labeled topic model was used to learn associations between human-based fidelity ratings and MI session transcripts, and then to generate codes for new sessions. The primary comparison was the accuracy of model-based codes against human-based codes. Results: Receiver operating characteristic (ROC) analyses of the model-based codes showed reasonably strong sensitivity and specificity relative to human raters (area under the ROC curve (AUC) ranging from 0.62 to 0.81; average 0.72). Agreement with human raters was evaluated at the level of talk turns as well as code tallies for an entire session. Generated codes agreed with human codes more reliably for session tallies, and agreement varied strongly by individual code. Conclusion: Scaling up the evaluation of behavioral interventions will require technological solutions. The current study provides preliminary, encouraging findings on the utility of statistical text classification in bridging this methodological gap.
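The evaluation step, ROC analysis of model-generated codes against human ratings, can be reproduced in miniature as below. The scores and labels are invented; only the evaluation metric matches the study's reported analysis.

```python
# Hedged sketch of the evaluation: ROC analysis comparing model-based
# fidelity codes with human ratings for one MI code. Data are invented.
import numpy as np
from sklearn.metrics import roc_auc_score

human_codes = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # human rater: code present?
model_scores = np.array([0.9, 0.2, 0.6, 0.8, 0.4, 0.3, 0.7, 0.5])  # model probability

auc = roc_auc_score(human_codes, model_scores)
print(f"AUC = {auc:.2f}")  # the study reports per-code AUCs of 0.62 to 0.81
```

Computing this per code, as the study did, exposes exactly which fidelity behaviors the labeled topic model captures well and which remain difficult.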