Dataset Information

Link prediction on Twitter.

ABSTRACT: With over 300 million active users, Twitter is among the largest online news and social networking services in existence today. Open access to information on Twitter makes it a valuable source of data for research on social interactions, sentiment analysis, content diffusion, link prediction, and the dynamics of human collective behaviour in general. Here we use Twitter data to construct co-occurrence language networks based on hashtags and on all the words in tweets, and we use these networks to study link prediction with different methods and evaluation metrics. In addition to using five known methods, we propose two effective weighted similarity measures, and we compare how the outcomes depend on the selected semantic context of topics on Twitter. We find that hashtag networks yield largely the same results as all-word networks, supporting the claim that hashtags alone robustly capture the semantic context of tweets and are therefore suitable for studying content and categorization. We also introduce ranking diagrams as an efficient tool for comparing the performance of different link prediction algorithms across multiple datasets. Our research indicates that successful link prediction algorithms correctly predict highly probable links even if the information about the network structure is incomplete, and they do so even if the semantic context is reduced to hashtags.
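
As a rough illustration of the approach the abstract describes, the sketch below builds a hashtag co-occurrence network from a few tweets and ranks unlinked hashtag pairs by their number of common neighbours. It is a minimal sketch under stated assumptions, not the authors' implementation: the sample tweets, the function names, and the choice of common neighbours as the similarity index are all assumptions made here, since the abstract names neither the five known methods nor the two proposed weighted measures.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical sample tweets for illustration; not taken from the dataset.
TWEETS = [
    "Reading about #networks and #linkprediction today",
    "#networks meet #twitter data",
    "#twitter #linkprediction experiments",
    "New #python tools for #networks",
]


def hashtags(text):
    """Return the sorted set of lower-cased hashtags found in one tweet."""
    return sorted({tag.lower() for tag in re.findall(r"#(\w+)", text)})


def build_cooccurrence(tweets):
    """Build an undirected weighted graph as a dict of dicts.

    The edge weight counts how many tweets contain both hashtags.
    """
    graph = defaultdict(dict)
    for text in tweets:
        for a, b in combinations(hashtags(text), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def common_neighbours(graph, u, v):
    """Unweighted common-neighbours score (assumed here as an example index)."""
    return len(set(graph[u]) & set(graph[v]))


def rank_missing_links(graph):
    """Score every unlinked pair and sort from most to least likely."""
    nodes = sorted(graph)
    scores = [
        ((u, v), common_neighbours(graph, u, v))
        for u, v in combinations(nodes, 2)
        if v not in graph[u]
    ]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


GRAPH = build_cooccurrence(TWEETS)
RANKED = rank_missing_links(GRAPH)

for pair, score in RANKED:
    print(pair, score)
```

The edge weights are recorded but not used by this score; a weighted similarity measure would replace `common_neighbours` while the rest of the pipeline stays the same.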

SUBMITTER: Martincic-Ipsic S

PROVIDER: S-EPMC5515441 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

Link prediction on Twitter.

Sanda Martinčić-Ipšić, Edvin Močibob, Matjaž Perc

PLoS ONE, 18 July 2017, issue 7

PMID: 28719651

Similar Datasets

Multimorbidity prediction using link prediction. | S-EPMC8360941 | biostudies-literature

Drug Response Prediction as a Link Prediction Problem. | S-EPMC5220354 | biostudies-other

What do computer scientists tweet? Analyzing the link-sharing practice on Twitter. | S-EPMC5479540 | biostudies-literature

Infectivity enhances prediction of viral cascades in Twitter. | S-EPMC6469756 | biostudies-literature

A link prediction approach to cancer drug sensitivity prediction. | S-EPMC5629619 | biostudies-literature

Regional Influenza Prediction with Sampling Twitter Data and PDE Model. | S-EPMC7037800 | biostudies-literature

Link Prediction through Deep Generative Model. | S-EPMC7575873 | biostudies-literature

Link prediction in multiplex online social networks. | S-EPMC5367313 | biostudies-literature

Simplicial closure and higher-order link prediction. | S-EPMC6275482 | biostudies-literature

A Scalable Similarity-Popularity Link Prediction Method. | S-EPMC7156691 | biostudies-literature