Dataset Information

Markov clustering versus affinity propagation for the partitioning of protein interaction graphs.

ABSTRACT: BACKGROUND: Genome scale data on protein interactions are generally represented as large networks, or graphs, where hundreds or thousands of proteins are linked to one another. Since proteins tend to function in groups, or complexes, an important goal has been to reliably identify protein complexes from these graphs. This task is commonly executed using clustering procedures, which aim at detecting densely connected regions within the interaction graphs. There exists a wealth of clustering algorithms, some of which have been applied to this problem. One of the most successful clustering procedures in this context has been the Markov Cluster algorithm (MCL), which was recently shown to outperform a number of other procedures, some of which were specifically designed for partitioning protein interaction graphs. A novel promising clustering procedure termed Affinity Propagation (AP) was recently shown to be particularly effective, and much faster than other methods for a variety of problems, but has not yet been applied to partition protein interaction graphs. RESULTS: In this work we compare the performance of the Affinity Propagation (AP) and Markov Clustering (MCL) procedures. To this end we derive an unweighted network of protein-protein interactions from a set of 408 protein complexes from S. cerevisiae, hand-curated in-house, and evaluate the performance of the two clustering algorithms in recalling the annotated complexes. In doing so the parameter space of each algorithm is sampled in order to select optimal values for these parameters, and the robustness of the algorithms is assessed by quantifying the level of complex recall as interactions are randomly added to or removed from the network to simulate noise. To evaluate the performance on a weighted protein interaction graph, we also apply the two algorithms to the consolidated protein interaction network of S. cerevisiae, derived from genome scale purification experiments, and to versions of this network in which varying proportions of the links have been randomly shuffled. CONCLUSION: Our analysis shows that the MCL procedure is significantly more tolerant to noise and behaves more robustly than the AP algorithm. The advantage of MCL over AP is dramatic for unweighted protein interaction graphs, as AP displays severe convergence problems on the majority of the unweighted graph versions that we tested, whereas MCL continues to identify meaningful clusters, albeit fewer of them, as the level of noise in the graph increases. MCL thus remains the method of choice for identifying protein complexes from binary interaction networks.
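
The noise simulation described in the abstract (randomly adding and removing interactions in an unweighted graph) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' protocol: the function name `perturb_edges`, the choice to express both noise levels as fractions of the original edge count, and the toy protein names are all hypothetical.

```python
import random
from itertools import combinations


def perturb_edges(nodes, edges, add_fraction, remove_fraction, seed=0):
    """Return a noisy copy of an unweighted, undirected edge set.

    Assumption of this sketch: add_fraction and remove_fraction are
    expressed relative to the original number of edges.
    """
    rng = random.Random(seed)
    original = {frozenset(e) for e in edges}

    # Randomly drop a share of the original interactions.
    n_remove = min(round(remove_fraction * len(original)), len(original))
    removed = set(rng.sample(sorted(original, key=sorted), n_remove))

    # Randomly add spurious interactions between pairs that were not linked.
    candidates = [
        frozenset(pair)
        for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in original
    ]
    n_add = min(round(add_fraction * len(original)), len(candidates))
    added = set(rng.sample(candidates, n_add))

    return (original - removed) | added


# Toy example: a five-protein chain with hypothetical names.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
noisy = perturb_edges(nodes, edges, add_fraction=0.5, remove_fraction=0.25)
```

Each perturbed edge set would then be handed to the clustering procedure under test, and the recall of the annotated complexes compared across noise levels.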

SUBMITTER: Vlasblom J

PROVIDER: S-EPMC2682798 | biostudies-other | 2009

REPOSITORIES: biostudies-other

Publications

Markov clustering versus affinity propagation for the partitioning of protein interaction graphs.

Vlasblom J, Wodak SJ

BMC Bioinformatics, 2009-03-30

PMID: 19331680

Similar Datasets

Functional clustering of immunoglobulin superfamily proteins with protein-protein interaction information calibrated hidden Markov model sequence profiles.
| S-EPMC3946809 | biostudies-literature

Defining objective clusters for rabies virus sequences using affinity propagation clustering.
| S-EPMC5794188 | biostudies-literature

Identifying functional modules in interaction networks through overlapping Markov clustering.
| S-EPMC3436797 | biostudies-literature

BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation.
| S-EPMC5345454 | biostudies-literature

Partitioning clustering algorithms for protein sequence data sets.
| S-EPMC2678123 | biostudies-literature

Hammock: a hidden Markov model-based peptide clustering algorithm to identify protein-interaction consensus motifs in large datasets.
| S-EPMC4681989 | biostudies-literature

Definition of customer requirements in big data using word vectors and affinity propagation clustering.
| S-EPMC8494268 | biostudies-literature

Defining objective clusters for rabies virus sequences using affinity propagation clustering
| PRJEB22369 | ENA

PANDA: Protein function prediction using domain architecture and affinity propagation.
| S-EPMC5823857 | biostudies-literature

Identification of functional networks in resting state fMRI data using adaptive sparse representation and affinity propagation clustering.
| S-EPMC4607787 | biostudies-literature