Dataset Information

Block models and personalized PageRank.

ABSTRACT: Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods through the "seed set expansion problem": given a subset S of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of "landing probabilities" of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. In this work, we develop a principled framework for evaluating ranking methods by studying seed set expansion applied to the stochastic block model. We derive the optimal gradient for separating the landing probabilities of two classes in a stochastic block model and find, surprisingly, that under reasonable assumptions the gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter α that depends on the block model parameters. This connection provides a formal motivation for the success of personalized PageRank in seed set expansion and node ranking generally. We use this connection to propose more advanced techniques incorporating higher moments of landing probabilities; our advanced methods exhibit greatly improved performance, despite being simple linear classification rules, and are even competitive with belief propagation.
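
The "weighted sums of landing probabilities" view described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy graph (two triangles joined by one edge), the truncation at 20 steps, and the parameter values 0.85 and 3.0. The sketch uses the standard textbook weightings, geometric weights for personalized PageRank and Poisson weights for the heat kernel; it does not reproduce the block-model-dependent parameter choice derived in the paper.

```python
import math


def landing_probabilities(adj, seeds, max_steps):
    """Return [r_0, ..., r_K]; r_k[v] is the probability that a k-step
    uniform random walk started at a uniformly chosen seed is at node v."""
    n = len(adj)
    current = [0.0] * n
    for s in seeds:
        current[s] += 1.0 / len(seeds)
    walks = [current]
    for _ in range(max_steps):
        nxt = [0.0] * n
        for u, mass in enumerate(current):
            if mass == 0.0 or not adj[u]:
                continue  # nothing to spread (mass at isolated nodes is dropped)
            share = mass / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        walks.append(nxt)
        current = nxt
    return walks


def weighted_score(walks, weights):
    """Score each node by a weighted sum of its landing probabilities."""
    n = len(walks[0])
    return [sum(w * r[v] for w, r in zip(weights, walks)) for v in range(n)]


def pagerank_weights(alpha, max_steps):
    """Truncated geometric weights, w_k proportional to alpha**k."""
    return [(1.0 - alpha) * alpha ** k for k in range(max_steps + 1)]


def heat_kernel_weights(t, max_steps):
    """Truncated Poisson weights, w_k = exp(-t) * t**k / k!."""
    return [math.exp(-t) * t ** k / math.factorial(k)
            for k in range(max_steps + 1)]


# Hypothetical toy graph: triangle {0, 1, 2} and triangle {3, 4, 5},
# joined by the single edge 2-3. Node 0 is the only seed.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
walks = landing_probabilities(adj, seeds=[0], max_steps=20)
ppr = weighted_score(walks, pagerank_weights(0.85, 20))
hk = weighted_score(walks, heat_kernel_weights(3.0, 20))

# Rank nodes by personalized PageRank score, highest first.
ranking = sorted(range(len(adj)), key=lambda v: -ppr[v])
print(ranking)
```

Both methods share the same landing probabilities and differ only in the weight vector, which is why a different weighting of the same walk statistics can be swapped in without changing the rest of the pipeline.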

SUBMITTER: Kloumann IM

PROVIDER: S-EPMC5224398 | biostudies-literature | 2017 Jan

REPOSITORIES: biostudies-literature

Publications

Block models and personalized PageRank.

Kloumann IM, Ugander J, Kleinberg J

Proceedings of the National Academy of Sciences of the United States of America, 2016 Dec 20, issue 1

PMID: 27999183

Similar Datasets

Project description:
Background: Fitness devices have spurred the development of apps that aim to motivate users, through interventions, to increase their physical activity (PA). Personalization in the interventions is essential as the target users are diverse with respect to their activity levels, requirements, preferences, and behavior.
Objective: This review aimed to (1) identify different kinds of personalization in interventions for promoting PA among any type of user group, (2) identify user models used for providing personalization, and (3) identify gaps in the current literature and suggest future research directions.
Methods: A scoping review was undertaken by searching the databases PsycINFO, PubMed, Scopus, and Web of Science. The main inclusion criteria were (1) studies that aimed to promote PA; (2) studies that had personalization, with the intention of promoting PA through technology-based interventions; and (3) studies that described user models for personalization.
Results: The literature search resulted in 49 eligible studies. Of these, 67% (33/49) focused solely on increasing PA, whereas the remaining studies had other objectives, such as maintaining a healthy lifestyle (8 studies), weight loss management (6 studies), and rehabilitation (2 studies). The reviewed studies provide personalization in 6 categories: goal recommendation, activity recommendation, fitness partner recommendation, educational content, motivational content, and intervention timing. With respect to the mode of generation, interventions were found to be semiautomated or automatic. Of these, the automatic interventions were either knowledge-based or data-driven or both. User models in the studies were constructed with parameters from 5 categories: PA profile, demographics, medical data, behavior change technique (BCT) parameters, and contextual information. Only 27 of the eligible studies evaluated the interventions for improvement in PA, and 16 of these concluded that the interventions to increase PA are more effective when they are personalized.
Conclusions: This review investigates personalization in the form of recommendations or feedback for increasing PA. On the basis of the review and gaps identified, research directions for improving the efficacy of personalized interventions are proposed. First, data-driven prediction techniques can facilitate effective personalization. Second, the use of BCTs in automated interventions, and in combination with PA guidelines, is yet to be explored, and preliminary studies in this direction are promising. Third, systems with automated interventions also need to be suitably adapted to serve specific needs of patients with clinical conditions. Fourth, previous user models focus on single metric evaluations of PA instead of a potentially more effective, holistic, and multidimensional view. Fifth, with the widespread adoption of activity monitoring devices and mobile phones, personalized and dynamic user models can be created using available user data, including users' social profile. Finally, the long-term effects of such interventions as well as the technology medium used for the interventions need to be evaluated rigorously.

Project description:
Background: Therapeutic vaccination against disseminated prostate cancer (PCa) is partially effective in some PCa patients. We hypothesized that the efficacy of treatment will be enhanced by individualized vaccination regimens tailored by simple mathematical models.
Methodology/principal findings: We developed a general mathematical model encompassing the basic interactions of a vaccine, immune system and PCa cells, and validated it by the results of a clinical trial testing an allogeneic PCa whole-cell vaccine. For model validation in the absence of any other pertinent marker, we used the clinically measured changes in prostate-specific antigen (PSA) levels as a correlate of tumor burden. Up to 26 PSA levels measured per patient were divided into each patient's training set and his validation set. The training set, used for model personalization, contained the patient's initial sequence of PSA levels; the validation set contained his subsequent PSA data points. Personalized models were simulated to predict changes in tumor burden and PSA levels and predictions were compared to the validation set. The model accurately predicted PSA levels over the entire measured period in 12 of the 15 vaccination-responsive patients (the coefficient of determination between the predicted and observed PSA values was R² = 0.972). The model could not account for the inconsistent changes in PSA levels in 3 of the 15 responsive patients at the end of treatment. Each validated personalized model was simulated under many hypothetical immunotherapy protocols to suggest alternative vaccination regimens. Personalized regimens predicted to enhance the effects of therapy differed among the patients.
Conclusions/significance: Using a few initial measurements, we constructed robust patient-specific models of PCa immunotherapy, which were retrospectively validated by clinical trial results. Our results emphasize the potential value and feasibility of individualized model-suggested immunotherapy protocols.
