Dataset Information

Crisis social media data labeled for storm-related information and toponym usage.

ABSTRACT: Social media provides citizens and officials with important sources of information during times of crisis. This data article makes available labeled, storm-related social media data collected over a six-hour period during a severe storm and F1 tornado that struck Central Pennsylvania on May 1st, 2017. Three datasets were collected from Twitter using location, keyword, and network filtering techniques, respectively. Only 2% of the 22,706 total tweets overlap among the datasets, providing researchers with a broader scope of information than normally available when collecting tweets using location (i.e., geotag-based) and keyword filtering alone or in combination during a crisis. Each data collection technique is described in detail, including network filtering which collects data from networks of social media users associated with a geographic area. The datasets are manually labeled for information content and toponym usage. The 22,706 tweet IDs, dehydrated for privacy, are labeled for relevance (storm-related and off-topic) and 19 types of storm-related information organized into six categories: infrastructure damage, service disruption, personal experience, weather updates, weather forecasts, and weather warnings. Data are also labeled for toponym usage (with or without toponyms), location (local, remote, and generic toponyms), and granularity (hyperlocal, municipal, and regional toponyms). The comprehensively labeled datasets provide researchers with opportunities to analyze crisis-related information behaviors and volunteered location information behaviors during a hyperlocal crisis event, as well as develop and evaluate automated filtering, geolocation, and event detection techniques that can aid citizens and crisis responders.
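
The overlap figure in the abstract can be checked once the three collections of tweet IDs are loaded. Below is a minimal sketch, not the author's method. It assumes each collection is available as a list of tweet ID strings, and it takes "overlap" to mean the share of unique IDs that occur in more than one collection. The function name, the collection keys, and the toy IDs are all hypothetical.

```python
from collections import Counter


def overlap_share(collections):
    """Return the fraction of unique tweet IDs found in more than one collection.

    `collections` maps a collection name to an iterable of tweet ID strings.
    """
    counts = Counter()
    for ids in collections.values():
        # Deduplicate within a collection so each ID counts once per collection.
        counts.update(set(ids))
    if not counts:
        return 0.0
    shared = sum(1 for n in counts.values() if n > 1)
    return shared / len(counts)


# Toy IDs invented for illustration; they are not taken from the dataset.
toy = {
    "location": ["101", "102", "103"],
    "keyword": ["103", "104"],
    "network": ["105", "106", "107", "101"],
}
share = overlap_share(toy)
print(f"{share:.1%} of unique IDs appear in more than one collection")
```

Tweet IDs are kept as strings here because they are large integers that some tools round when parsing them as numbers.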

SUBMITTER: Grace R

PROVIDER: S-EPMC7200777 | biostudies-literature | 2020 Jun

REPOSITORIES: biostudies-literature

Publications

Crisis social media data labeled for storm-related information and toponym usage.

Grace Rob R

Data in Brief, 2020-04-21

PMID: 32382607

Similar Datasets

Project description:

Background: Health-related misinformation can be propagated via social media and is a threat to public health. Several quality assessment tools and principles to evaluate health-related information in the public domain exist; however, these were not designed specifically for social media.

Objective: This study aims to develop Principles for Health-related Information on Social Media (PRHISM), which can be used to evaluate the quality of health-related social media content.

Methods: A modified Delphi approach was used to obtain expert consensus on the principles and functions of PRHISM. Health and social media experts were recruited via Twitter, email, and snowballing. A total of 3 surveys were administered between February 2021 and May 2021. The first survey was informed by a literature review and included open-ended questions and items from existing quality assessment tools. Subsequent surveys were informed by the results of the preceding survey. Consensus was defined as ≥80% agreement, and items with consensus were considered relevant to include in PRHISM. After the third survey, principles were finalized, and an instruction manual and scoring tool for PRHISM were developed and circulated to expert participants for final feedback.

Results: A total of 34 experts consented to participate, of whom 18 (53%) responded to all 3 Delphi surveys. In total, 13 principles were considered relevant and were included in PRHISM. When the instructions and PRHISM scoring tool were circulated, no objections to the wording of the final principles were received.

Conclusions: A total of 13 quality principles were included in the PRHISM tool, along with a scoring system and implementation tool. The principles promote accessibility, transparency, provision of authoritative and evidence-based information, and support for consumers' relationships with health care providers. PRHISM can be used to evaluate the quality of health-related information provided on social media. These principles may also be useful to content creators for developing high-quality health-related social media content and may assist consumers in discerning high- and low-quality information.
