Dataset Information

Optimal clustering for detecting near-native conformations in protein docking.

ABSTRACT: Clustering is one of the most powerful tools in computational biology. The conventional wisdom is that events that occur in clusters are probably not random. In protein docking, the underlying principle is that clustering occurs because long-range electrostatic and/or desolvation forces steer the proteins to a low free-energy attractor at the binding region. Something similar occurs in the docking of small molecules, although in this case shorter-range van der Waals forces play a more critical role. Based on the above, we have developed two different clustering strategies to predict docked conformations based on the clustering properties of a uniform sampling of low free-energy protein-protein and protein-small molecule complexes. We report on significant improvements in the automated prediction and discrimination of docked conformations by using the cluster size and consensus as a ranking criterion. We show that the success of clustering depends on identifying the appropriate clustering radius of the system. The clustering radius for protein-protein complexes is consistent with the range of the electrostatics and desolvation free energies (i.e., between 4 and 9 Angstroms); for protein-small molecule docking, the radius is set by van der Waals interactions (i.e., at approximately 2 Angstroms). Without any a priori information, a simple analysis of the histogram of distance separations between the set of docked conformations can evaluate the clustering properties of the data set. Clustering is observed when the histogram is bimodal. Data clustering is optimal if one chooses the clustering radius to be the minimum after the first peak of the bimodal distribution. We show that using this optimal radius further improves the discrimination of near-native complex structures.
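The radius-selection rule in the abstract (take the minimum after the first peak of the bimodal histogram of pairwise separations, then rank by cluster size) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything here is an assumption made for illustration: docked conformations are stood in for by plain coordinate tuples, Euclidean distance replaces whatever structural distance the study used, the histogram is assumed smooth enough that its first local maximum is the first peak, and the greedy largest-neighbourhood-first clustering and all function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Return Euclidean distances for all unordered pairs of points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def histogram(distances, bin_width):
    """Count distances in fixed-width bins starting at 0."""
    n_bins = int(max(distances) / bin_width) + 1
    counts = [0] * n_bins
    for d in distances:
        counts[int(d / bin_width)] += 1
    return counts


def radius_from_histogram(counts, bin_width):
    """Return the centre of the first minimum bin after the first peak.

    Assumes a bimodal histogram without noise; real data would
    probably need smoothing before this simple walk is reliable.
    """
    i = 0
    # Climb to the first peak.
    while i + 1 < len(counts) and counts[i + 1] >= counts[i]:
        i += 1
    # Descend until the counts stop decreasing.
    while i + 1 < len(counts) and counts[i + 1] < counts[i]:
        i += 1
    return (i + 0.5) * bin_width


def rank_clusters(points, radius):
    """Greedily extract clusters, largest neighbourhood first.

    Returns (center_index, member_indices) pairs in order of extraction,
    so the first entry is the most populated cluster.
    """
    remaining = set(range(len(points)))
    clusters = []
    while remaining:
        # Neighbours of each remaining point, counting the point itself.
        neighbours = {
            i: [j for j in remaining if math.dist(points[i], points[j]) <= radius]
            for i in remaining
        }
        # Tie-break deterministically: larger neighbourhood, then lower index.
        center = min(remaining, key=lambda i: (-len(neighbours[i]), i))
        members = sorted(neighbours[center])
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


if __name__ == "__main__":
    # Hypothetical toy data: two tight groups and one outlier.
    poses = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (20, 20)]
    counts = histogram(pairwise_distances(poses), bin_width=1.0)
    radius = radius_from_histogram(counts, bin_width=1.0)
    for center, members in rank_clusters(poses, radius):
        print(center, members)
```

In this toy setting the short intra-group separations form the first peak and the longer inter-group separations form the second, so the radius lands in the gap between them. The bin width is a free parameter of the sketch and would need tuning against the distance scale of a real docking set.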

SUBMITTER: Kozakov D

PROVIDER: S-EPMC1366636 | biostudies-literature | 2005 Aug

REPOSITORIES: biostudies-literature

Publications

Optimal clustering for detecting near-native conformations in protein docking.

Dima Kozakov, Karl H. Clodfelter, Sandor Vajda, Carlos J. Camacho

Biophysical Journal, 2005-05-20, issue 2

PMID: 15908573

Similar Datasets

Project description: The protein docking problem has two major aspects: sampling conformations and orientations, and scoring them for fit. To investigate the extent to which the protein docking problem may be attributed to the sampling of ligand side-chain conformations, multiple conformations of multiple residues were calculated for the uncomplexed (unbound) structures of protein ligands. These ligand conformations were docked into both the complexed (bound) and unbound conformations of the cognate receptors, and their energies were evaluated using an atomistic potential function. The following questions were considered: (1) Does the ensemble of precalculated ligand conformations contain a structure similar to the bound form of the ligand? (2) Can the large number of conformations that are calculated be efficiently docked into the receptors? (3) Can near-native complexes be distinguished from non-native complexes? Results from seven test systems suggest that the precalculated ensembles do include side-chain conformations similar to those adopted in the experimental complexes. By assuming additivity among the side chains, the ensemble can be docked in less than 12 h on a desktop computer. These multiconformer dockings produce near-native complexes and also non-native complexes. When docked against the bound conformations of the receptors, the near-native complexes of the unbound ligand were always distinguishable from the non-native complexes. When docked against the unbound conformations of the receptors, the near-native dockings could usually, but not always, be distinguished from the non-native complexes. In every case, docking the unbound ligands with flexible side chains led to better energies and a better distinction between near-native and non-native fits. An extension of this algorithm allowed for docking multiple residue substitutions (mutants) in addition to multiple conformations. The rankings of the docked mutant proteins correlated with experimental binding affinities. These results suggest that sampling multiple residue conformations and residue substitutions of the unbound ligand contributes to, but does not fully provide, a solution to the protein docking problem. Conformational sampling allows a classical atomistic scoring function to be used; such a function may contribute to better selectivity between near-native and non-native complexes. Allowing for receptor flexibility may further extend these results.

