Dataset Information

Applying Machine Learning to Identify Anti-Vaccination Tweets during the COVID-19 Pandemic.

ABSTRACT: Anti-vaccination attitudes have been an issue since the development of the first vaccines. The increasing use of social media as a source of health information may contribute to vaccine hesitancy because anti-vaccination content is widely available on social media, including Twitter. Being able to identify anti-vaccination tweets could provide useful information for formulating strategies to reduce anti-vaccination sentiments among different groups. This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets that were published during the COVID-19 pandemic. We compared the performance of bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory networks with pre-trained GloVe embeddings (Bi-LSTM) with classic machine learning methods, including support vector machine (SVM) and naïve Bayes (NB). On the test set, the BERT model achieved: accuracy = 91.6%, precision = 93.4%, recall = 97.6%, F1 score = 95.5%, and AUC = 84.7%. The Bi-LSTM model achieved: accuracy = 89.8%, precision = 44.0%, recall = 47.2%, F1 score = 45.5%, and AUC = 85.8%. SVM with a linear kernel achieved: accuracy = 92.3%, precision = 19.5%, recall = 78.6%, F1 score = 31.2%, and AUC = 85.6%. Complement NB achieved: accuracy = 88.8%, precision = 23.0%, recall = 32.8%, F1 score = 27.1%, and AUC = 62.7%. In conclusion, the BERT model outperformed the Bi-LSTM, SVM, and NB models in this task. Moreover, the BERT model achieved excellent performance and can be used to identify anti-vaccination tweets in future studies.
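
The F1 scores quoted in the abstract can be cross-checked against the reported precision and recall, since F1 is by definition the harmonic mean of the two. The following is a minimal sketch of that check, not the authors' evaluation code; the names `reported` and `f1_score` are illustrative assumptions, and the figures are copied from the abstract as percentages.

```python
# Reported test-set figures from the abstract, in percent:
# model name -> (precision, recall, reported F1).
reported = {
    "BERT": (93.4, 97.6, 95.5),
    "Bi-LSTM": (44.0, 47.2, 45.5),
    "SVM (linear kernel)": (19.5, 78.6, 31.2),
    "Complement NB": (23.0, 32.8, 27.1),
}


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall (same units as inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from precision and recall, rounded to one decimal place
# to match the precision used in the abstract.
recomputed = {
    name: round(f1_score(p, r), 1) for name, (p, r, _) in reported.items()
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: reported F1 = {f1}, recomputed F1 = {recomputed[name]}")
```

Under these assumptions the recomputed values agree with the reported F1 scores to within 0.1 percentage point, a gap that is consistent with the inputs having been rounded before publication.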

SUBMITTER: To QG

PROVIDER: S-EPMC8069687 | biostudies-literature | 2021 Apr

REPOSITORIES: biostudies-literature

Publications

Applying Machine Learning to Identify Anti-Vaccination Tweets during the COVID-19 Pandemic.

Quyen G To, Kien G To, Van-Anh N Huynh, Nhung T Q Nguyen, Diep T N Ngo, Stephanie J Alley, Anh N Q Tran, Anh N P Tran, Ngan T T Pham, Thanh X Bui, Corneel Vandelanotte

International Journal of Environmental Research and Public Health, 2021 Apr 12, issue 8

PMID: 33921539

Similar Datasets

Leveraging deep learning to detect stance in Spanish tweets on COVID-19 vaccination. | S-EPMC11854073 | biostudies-literature

The Hidden Pandemic of Family Violence During COVID-19: Unsupervised Learning of Tweets. | S-EPMC7652592 | biostudies-literature

Anti-intellectualism amid the COVID-19 pandemic: The discursive elements and sources of anti-Fauci tweets. | S-EPMC9892881 | biostudies-literature

A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis. | S-EPMC7906356 | biostudies-literature

Machine-learning approaches to identify determining factors of happiness during the COVID-19 pandemic: retrospective cohort study. | S-EPMC9764099 | biostudies-literature

TClustVID: A novel machine learning classification model to investigate topics and sentiment in COVID-19 tweets. | S-EPMC8099549 | biostudies-literature

Using machine learning probabilities to identify effects of COVID-19. | S-EPMC10724367 | biostudies-literature

Understanding the vaccine stance of Italian tweets and addressing language changes through the COVID-19 pandemic: Development and validation of a machine learning model. | S-EPMC9372360 | biostudies-literature

Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques. | S-EPMC10644111 | biostudies-literature

Applying machine learning to identify autistic adults using imitation: An exploratory study. | S-EPMC5558936 | biostudies-literature