Dataset Information

Text classification models for the automatic detection of nonmedical prescription medication use from social media.

ABSTRACT:

Background

Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging-requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM abuse from Twitter.

Methods

We experimented with state-of-the-art bi-directional transformer-based language models, which utilize tweet-level representations that enable transfer learning (e.g., BERT, RoBERTa, XLNet, AlBERT, and DistilBERT), proposed fusion-based approaches, and compared the developed models with several traditional machine learning, including deep learning, approaches. Using a public dataset, we evaluated the performances of the classifiers on their abilities to classify the non-majority "abuse/misuse" class.

Results

Our proposed fusion-based model performs significantly better than the best traditional model (F₁-score [95% CI]: 0.67 [0.64-0.69] vs. 0.45 [0.42-0.48]). We illustrate, via experimentation using varying training set sizes, that the transformer-based models are more stable and require less annotated data compared to the other models. The significant improvements achieved by our best-performing classification model over past approaches makes it suitable for automated continuous monitoring of nonmedical PM use from Twitter.

Conclusions

BERT, BERT-like and fusion-based models outperform traditional machine learning and deep learning models, achieving substantial improvements over many years of past research on the topic of prescription medication misuse/abuse classification from social media, which had been shown to be a complex task due to the unique ways in which information about nonmedical use is presented. Several challenges associated with the lack of context and the nature of social media language need to be overcome to further improve BERT and BERT-like models. These experimental driven challenges are represented as potential future research directions.

SUBMITTER: Al-Garadi MA

PROVIDER: S-EPMC7835447 | biostudies-literature | 2021 Jan

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Text classification models for the automatic detection of nonmedical prescription medication use from social media.

Al-Garadi Mohammed Ali MA Yang Yuan-Chi YC Cai Haitao H Ruan Yucheng Y O'Connor Karen K Graciela Gonzalez-Hernandez GH Perrone Jeanmarie J Sarker Abeed A

BMC medical informatics and decision making 20210126 1

<h4>Background</h4>Prescription medication (PM) misuse/abuse has emerged as a national crisis in the United States, and social media has been suggested as a potential resource for performing active monitoring. However, automating a social media-based monitoring system is challenging-requiring advanced natural language processing (NLP) and machine learning methods. In this paper, we describe the development and evaluation of automatic text classification models for detecting self-reports of PM ab ...[more]

PMID: 33499852

Similar Datasets

Project description:ObjectivesLarge language models (LLMs) have demonstrated remarkable success in natural language processing (NLP) tasks. This study aimed to evaluate their performances on social media-based health-related text classification tasks.Materials and methodsWe benchmarked 1 Support Vector Machine (SVM), 3 supervised pretrained language models (PLMs), and 2 LLMs-based classifiers across 6 text classification tasks. We developed 3 approaches for leveraging LLMs: employing LLMs as zero-shot classifiers, using LLMs as data annotators, and utilizing LLMs with few-shot examples for data augmentation.ResultsAcross all tasks, the mean (SD) F1 score differences for RoBERTa, BERTweet, and SocBERT trained on human-annotated data were 0.24 (±0.10), 0.25 (±0.11), and 0.23 (±0.11), respectively, compared to those trained on the data annotated using GPT3.5, and were 0.16 (±0.07), 0.16 (±0.08), and 0.14 (±0.08) using GPT4, respectively. The GPT3.5 and GPT4 zero-shot classifiers outperformed SVMs in a single task and in 5 out of 6 tasks, respectively. When leveraging LLMs for data augmentation, the RoBERTa models trained on GPT4-augmented data demonstrated superior or comparable performance compared to those trained on human-annotated data alone.DiscussionThe results revealed that using LLM-annotated data only for training supervised classification models was ineffective. However, employing the LLM as a zero-shot classifier exhibited the potential to outperform traditional SVM models and achieved a higher recall than the advanced transformer-based model RoBERTa. Additionally, our results indicated that utilizing GPT3.5 for data augmentation could potentially harm model performance. In contrast, data augmentation with GPT4 demonstrated improved model performances, showcasing the potential of LLMs in reducing the need for extensive training data.ConclusionsBy leveraging the data augmentation strategy, we can harness the power of LLMs to develop smaller, more effective domain-specific NLP models. Using LLM-annotated data without human guidance for training lightweight supervised classification models is an ineffective strategy. However, LLM, as a zero-shot classifier, shows promise in excluding false negatives and potentially reducing the human effort required for data annotation.

Project description:Unstructured clinical narratives are continuously being recorded as part of delivery of care in electronic health records, and dedicated tagging staff spend considerable effort manually assigning clinical codes for billing purposes. Despite these efforts, however, label availability and accuracy are both suboptimal. In this retrospective study, we aimed to automate the assignment of top-level International Classification of Diseases version 9 (ICD-9) codes to clinical records from human and veterinary data stores using minimal manual labor and feature curation. Automating top-level annotations could in turn enable rapid cohort identification, especially in a veterinary setting. To this end, we trained long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human and 89,591 veterinary records. We investigated the accuracy of both separate-domain and combined-domain models and probed model portability. We established relevant baseline classification performances by training Decision Trees (DT) and Random Forests (RF). We also investigated whether transforming the data using MetaMap Lite, a clinical natural language processing tool, affected classification performance. We showed that the LSTM-RNNs accurately classify veterinary and human text narratives into top-level categories with an average weighted macro F1 score of 0.74 and 0.68 respectively. In the "neoplasia" category, the model trained on veterinary data had a high validation accuracy in veterinary data and moderate accuracy in human data, with F1 scores of 0.91 and 0.70 respectively. Our LSTM method scored slightly higher than that of the DT and RF models. The use of LSTM-RNN models represents a scalable structure that could prove useful in cohort identification for comparative oncology studies. Digitization of human and veterinary health information will continue to be a reality, particularly in the form of unstructured narratives. Our approach is a step forward for these two domains to learn from and inform one another.

Project description:BackgroundThe use of social media data provides an opportunity to complement traditional influenza and COVID-19 surveillance methods for the detection and control of outbreaks and informing public health interventions.ObjectiveThe first aim of this study is to investigate the degree to which Twitter users disclose health experiences related to influenza and COVID-19 that could be indicative of recent plausible influenza cases or symptomatic COVID-19 infections. Second, we seek to use the Twitter datasets to train and evaluate the classification performance of Bidirectional Encoder Representations from Transformers (BERT) and variant language models in the context of influenza and COVID-19 infection detection.MethodsWe constructed two Twitter datasets using a keyword-based filtering approach on English-language tweets collected from December 2016 to December 2022 in Saskatchewan, Canada. The influenza-related dataset comprised tweets filtered with influenza-related keywords from December 13, 2016, to March 17, 2018, while the COVID-19 dataset comprised tweets filtered with COVID-19 symptom-related keywords from January 1, 2020, to June 22, 2021. The Twitter datasets were cleaned, and each tweet was annotated by at least two annotators as to whether it suggested recent plausible influenza cases or symptomatic COVID-19 cases. We then assessed the classification performance of pre-trained transformer-based language models, including BERT-base, BERT-large, RoBERTa-base, RoBERT-large, BERTweet-base, BERTweet-covid-base, BERTweet-large, and COVID-Twitter-BERT (CT-BERT) models, on each dataset. To address the notable class imbalance, we experimented with both oversampling and undersampling methods.ResultsThe influenza dataset had 1129 out of 6444 (17.5%) tweets annotated as suggesting recent plausible influenza cases. The COVID-19 dataset had 924 out of 11939 (7.7%) tweets annotated as inferring recent plausible COVID-19 cases. When compared against other language models on the COVID-19 dataset, CT-BERT performed the best, supporting the highest scores for recall (94.8%), F1(94.4%), and accuracy (94.6%). For the influenza dataset, BERTweet models exhibited better performance. Our results also showed that applying data balancing techniques such as oversampling or undersampling method did not lead to improved model performance.ConclusionsUtilizing domain-specific language models for monitoring users' health experiences related to influenza and COVID-19 on social media shows improved classification performance and has the potential to supplement real-time disease surveillance.

Dataset Information

Text classification models for the automatic detection of nonmedical prescription medication use from social media.

Background

Methods

Results

Conclusions

Publications

Text classification models for the automatic detection of nonmedical prescription medication use from social media.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets