Dataset Information

Linearity of network proximity measures: implications for set-based queries and significance testing.

ABSTRACT:

Motivation

In recent years, various network proximity measures have been proposed to facilitate the use of biomolecular interaction data in a broad range of applications. These applications include functional annotation, disease gene prioritization, comparative analysis of biological systems and prediction of new interactions. In such applications, a major task is the scoring or ranking of the nodes in the network in terms of their proximity to a given set of 'seed' nodes (e.g. a group of proteins that are identified to be associated with a disease, or are differentially expressed in a certain condition). Many different network proximity measures are utilized for this purpose, and these measures are quite diverse in terms of the benefits they offer.

Results

We propose a unifying framework for characterizing network proximity measures for set-based queries. We observe that many existing measures are linear, in that the proximity of a node to a set of nodes can be represented as an aggregation of its proximity to the individual nodes in the set. Based on this observation, we propose methods for processing of set-based proximity queries that take advantage of sparse local proximity information. In addition, we provide an analytical framework for characterizing the distribution of proximity scores based on reference models that accurately capture the characteristics of the seed set (e.g. degree distribution and biological function). The resulting framework facilitates computation of exact figures for the statistical significance of network proximity scores, enabling assessment of the accuracy of Monte Carlo simulation based estimation methods.
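The linearity property described above can be illustrated with a minimal sketch. It assumes random walk with restart as the proximity measure, a hypothetical six-node toy network, and a uniform null model over seed sets of fixed size; none of these choices, nor the function names, come from the paper, whose reference models also account for properties such as degree distribution. The sketch checks that the set-based score equals the average of the precomputed single-seed scores, and that the null mean obtained analytically matches exhaustive enumeration of all seed sets.

```python
from itertools import combinations

import numpy as np


def transition_matrix(adj):
    """Column-stochastic transition matrix (assumes no isolated nodes)."""
    return adj / adj.sum(axis=0, keepdims=True)


def single_seed_proximity(adj, restart=0.5):
    """Matrix whose column u holds every node's proximity to the single seed u.

    Closed form of the fixed point p = (1 - r) W p + r e_u.
    """
    n = adj.shape[0]
    W = transition_matrix(adj)
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * W)


def set_proximity(adj, seeds, restart=0.5, iters=200):
    """Proximity to a seed set, computed directly by power iteration."""
    n = adj.shape[0]
    W = transition_matrix(adj)
    e = np.zeros(n)
    e[list(seeds)] = 1.0 / len(seeds)  # uniform restart over the seed set
    p = e.copy()
    for _ in range(iters):
        p = (1.0 - restart) * (W @ p) + restart * e
    return p


# Hypothetical toy network, used only for illustration.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

P = single_seed_proximity(A)
seeds = [0, 4]

# Linearity: the set-based score is the average of the single-seed scores.
direct = set_proximity(A, seeds)
aggregated = P[:, seeds].mean(axis=1)
print(np.allclose(direct, aggregated))

# Exact null mean under uniformly random seed sets of size k (an assumed,
# simplified null model): by linearity it is the row mean of P.
k = len(seeds)
exact_null_mean = P.mean(axis=1)
enumerated_null_mean = np.mean(
    [P[:, list(s)].mean(axis=1) for s in combinations(range(n), k)], axis=0
)
print(np.allclose(exact_null_mean, enumerated_null_mean))
```

Because the single-seed matrix is computed once, any later seed-set query reduces to averaging a few of its columns, which is the kind of reuse that linearity makes possible. Moments of the null distribution can likewise be read off that matrix without sampling.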

Availability and implementation

Implementations of the methods in this paper are available at https://bioengine.case.edu/crosstalker, which also includes a robust visualization for viewing results.

Contact

stm@case.edu or mxk331@case.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Maxwell S

PROVIDER: S-EPMC5860101 | biostudies-literature | 2017 May

REPOSITORIES: biostudies-literature

Publications

Linearity of network proximity measures: implications for set-based queries and significance testing.

Sean Maxwell, Mark R. Chance, Mehmet Koyutürk

Bioinformatics (Oxford, England), 2017-05-01, issue 9

PMID: 28453667

Similar Datasets

Project description:

Background

It is unclear whether psychotic experiences (PEs) gradually merge into states of clinical psychosis along a continuum which corresponds to a dimensional classification or whether latent classes appear above a certain severity threshold which correspond better to diagnostic categories of psychosis.

Methods

Annual cross-sectional surveys, 2014-19, among Chinese undergraduates (N = 47,004) measured PEs, depression and etiological risk factors using standardized self-report instruments. We created a psychosis continuum with five levels and tested linear and extra-linear contrasts in associated etiological risk factors, before and after adjustment for depression. We carried out latent class analysis.

Results

Categorical expression of psychosis, including hallucinations and delusions, nuclear symptoms, and nuclear symptoms and depression were found at severe level 5. Etiological risk factors which impacted linearly across the continuum were more common for depression. Child maltreatment impacted extra-linearly on both psychosis and depression. Family history of psychosis impacted linearly on psychosis; male sex and urban birth impacted extra-linearly and were specific for psychosis. Four latent classes were found, but only at level 5. These corresponded to nuclear schizophrenia symptoms, nuclear schizophrenia and depressive symptoms, severe depression, and an unclassified category with moderate prevalence of PEs.

Conclusion

Quantitative and qualitative changes in the underlying structure of psychosis were observed at the most severe level along a psychosis continuum, where four latent classes emerged. These corresponded to existing categorical classifications but require confirmation with clinical interview. PEs are non-specific and our findings suggest some are on a continuum with depression, whilst others are on a continuum with non-affective psychosis. Differing patterns of impact from etiological risk factors across the spectrum of psychopathology determine outcome at the most severe level of these continua.
