Dataset Information

Detecting COVID-19-Related Fake News Using Feature Extraction.

ABSTRACT: Since the emergence of COVID-19 in December 2019, numerous posts and news items about the pandemic have appeared on social media and in traditional print and electronic media. These sources carry information from both trusted and non-trusted medical sources, and news spreads rapidly through them. Spreading deceptive information may lead to anxiety, unwanted exposure to medical remedies, and digital marketing tricks, and may even have deadly consequences. Therefore, a model for detecting fake news in the news pool is essential. In this work, a dataset that fuses COVID-19-related news from several social media and news sources is used for classification. First, the dataset is preprocessed to remove unwanted text, and tokenization is then carried out to extract tokens from the raw text collected from the various sources. Next, feature selection is performed to avoid the computational overhead of processing all the features in the dataset. Linguistic and sentiment features are extracted for further processing. Finally, several state-of-the-art machine learning algorithms are trained to classify the COVID-19-related dataset and are evaluated using various metrics. The results show that the random forest classifier outperforms the other classifiers, with an accuracy of 88.50%.
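
The preprocessing, tokenization, and feature-extraction steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cleaning rules, the feature names, and the tiny sentiment lexicon below are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import re

# Hypothetical sentiment lexicon, for illustration only (not from the paper).
POSITIVE = {"safe", "effective", "recovered", "protect"}
NEGATIVE = {"deadly", "hoax", "dangerous", "fear", "panic"}

URL_RE = re.compile(r"https?://\S+")
# A token is a run of letters/digits, optionally hyphenated (e.g. "covid-19").
TOKEN_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def preprocess(text):
    """Remove URLs and collapse runs of whitespace (assumed cleaning rules)."""
    text = URL_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def extract_features(raw):
    """Return a few simple linguistic and sentiment features for one text."""
    clean = preprocess(raw)
    tokens = tokenize(clean)
    n = len(tokens)
    letters = [c for c in clean if c.isalpha()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        "n_tokens": n,
        "avg_token_len": sum(len(t) for t in tokens) / n if n else 0.0,
        "exclamations": clean.count("!"),
        "upper_ratio": (
            sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
        ),
        # Lexicon-based polarity in [-1, 1]; negative means more negative words.
        "sentiment": (pos - neg) / n if n else 0.0,
    }


features = extract_features("COVID-19 is a HOAX!!! Read https://example.com now")
```

Feature dictionaries like this one could then be converted to numeric vectors and passed to a classifier such as scikit-learn's `RandomForestClassifier`; that step is omitted here to keep the sketch dependency-free.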

SUBMITTER: Khan S

PROVIDER: S-EPMC8764372 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature


Publications

Detecting COVID-19-Related Fake News Using Feature Extraction.

Khan Suleman, Hakak Saqib, Deepa N, Prabadevi B, Dev Kapal, Trelova Silvia

Frontiers in Public Health, 2022-01-04


PMID: 35059379

Similar Datasets

Project description: Disinformation (fake news) is a major problem that affects modern populations, especially in an era when information can be spread from one corner of the world to another in just one click. The diffusion of misinformation becomes more problematic when it addresses issues related to health, as it can affect people at both the individual and population levels. Through the ideas proposed by cultural evolution theory, in this study, we seek to understand the dynamics of disseminating messages (cultural traits) with untrue content (maladaptive traits). For our investigation, we used the scenario caused by the Coronavirus Disease 2019 (COVID-19) pandemic as a model. The instability caused by the pandemic provides a good model for the study of adapted and maladaptive traits, as the information can directly affect individual and population fitness. Through data collected on the Twitter platform (259,176 tweets) and using machine learning techniques and web scraping, we built a predictive model to analyze the following questions: (1) Is false information more shared? (2) Is false information more adopted? (3) Do people with social prestige influence the dissemination of maladaptive traits of COVID-19? We observed that fake news features contained in messages with false information were shared and adopted as unblemished messages. We also observed that social prestige was not a determining factor for the diffusion of maladaptive traits. Even with the ability to allow connections between individuals participating in social media, some factors such as attachment to cultural traits and the formation of social bubbles can favor isolation and decrease connectivity between individuals. Consequently, in the scenario of isolation between groups and low connectivity between individuals, there is a reduction in cultural exchange between people, which interferes with the dynamics of the selection of cultural traits. Thus, maladaptive (harmful) traits are favored and maintained in the cultural system. We also argue that the local Brazilian cultural context can be a determining factor for maintaining maladaptive traits. We conclude that in an unstable (pandemic) scenario, the information transmitted on Twitter is not reliable in relation to the increase in fitness, which may occur because of the low cultural exchange promoted by the personalization of the social network and cultural context of the population.

Supplementary information: The online version contains supplementary material available at 10.1007/s42979-021-00836-w.

