Dataset Information

Selecting high quality protein structures from diverse conformational ensembles.

ABSTRACT: Protein structure prediction encompasses two major challenges: (1) the generation of a large ensemble of high-resolution structures for a given amino-acid sequence; and (2) the identification of the structure closest to the native structure in a blind prediction. In this article, we address the second challenge by proposing what is, to our knowledge, a novel iterative clustering method, based on the traveling-salesman problem, to identify the structures in a given ensemble that are closest to the native structure. At each iteration, the method eliminates clusters of structures that are unlikely to share the native fold, based on a statistical analysis of cluster density and average spherical radius. The method, denoted ICON, has been tested on four data sets: (1) 1400 proteins with high-resolution decoys; (2) medium-to-low-resolution decoys from Decoys 'R' Us; (3) medium-to-low-resolution decoys from the first-principles approach ASTRO-FOLD; and (4) selected targets from CASP8. The extensive tests demonstrate that ICON can identify high-quality structures in each ensemble, regardless of the resolution of the conformers. Across a total of 1454 proteins, with an average of 1051 conformers per protein, the conformers selected by ICON are, on average, in the top 3.5% of the conformers in the ensemble.
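
The abstract describes the iterative elimination only in outline. The following is a minimal sketch of that general idea, not the ICON implementation: it assumes a precomputed pairwise distance matrix (for example, RMSD between conformers), substitutes a greedy nearest-neighbor tour for a real traveling-salesman solver, and uses a made-up density score. All function names, the `factor` and `keep` parameters, and the toy data are hypothetical.

```python
from statistics import mean

def nearest_neighbor_order(ids, dist):
    """Greedy nearest-neighbor tour (assumption: a crude stand-in for a TSP solver)."""
    order = [ids[0]]
    remaining = set(ids[1:])
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda j: (dist[last][j], j))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def cut_into_clusters(order, dist, factor=2.0):
    """Split the tour wherever a step exceeds `factor` times the mean step length."""
    steps = [dist[a][b] for a, b in zip(order, order[1:])]
    if not steps:
        return [list(order)]
    threshold = factor * mean(steps)
    clusters, current = [], [order[0]]
    for node, step in zip(order[1:], steps):
        if step > threshold:
            clusters.append(current)
            current = []
        current.append(node)
    clusters.append(current)
    return clusters

def radius(cluster, dist):
    """Mean distance from the cluster medoid to every member (medoid included)."""
    medoid = min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))
    return mean(dist[medoid][j] for j in cluster)

def density(cluster, dist):
    """Hypothetical score: members per unit of (1 + radius); not the paper's statistic."""
    return len(cluster) / (1.0 + radius(cluster, dist))

def select_conformers(dist, keep=3, factor=2.0):
    """Repeatedly drop the lowest-density cluster until at most `keep` conformers remain."""
    ids = list(range(len(dist)))
    while len(ids) > keep:
        clusters = cut_into_clusters(nearest_neighbor_order(ids, dist), dist, factor)
        if len(clusters) < 2:
            break  # nothing left to separate
        worst = min(clusters, key=lambda c: density(c, dist))
        ids = sorted(i for i in ids if i not in worst)
    return ids

# Toy data: four conformers packed together and two outliers, placed on a line.
points = [0.0, 0.1, 0.2, 0.3, 5.0, 9.0]
dist = [[abs(a - b) for b in points] for a in points]
selected = select_conformers(dist, keep=4)
print(selected)
```

In this toy case the two isolated points are removed in successive iterations and the dense group survives. A real application would need an actual TSP-based ordering and the statistical criteria from the article, neither of which is specified in this record.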

SUBMITTER: Subramani A

PROVIDER: S-EPMC2749775 | biostudies-literature | 2009 Sep

REPOSITORIES: biostudies-literature

Publications

Selecting high quality protein structures from diverse conformational ensembles.

Ashwin Subramani, Peter A. DiMaggio, Christodoulos A. Floudas

Biophysical Journal, 2009-09-01, issue 6

PMID: 19751678

Similar Datasets

Protein Ensembles: How Does Nature Harness Thermodynamic Fluctuations for Life? The Diverse Functional Roles of Conformational Ensembles in the Cell.
| S-EPMC6407618 | biostudies-literature

Selecting Subpopulations of High-Quality Protein Conformers among Conformational Mixtures of Recombinant Bovine MMP-9 Solubilized from Inclusion Bodies.
| S-EPMC8001920 | biostudies-literature

Multiscale characterization of protein conformational ensembles.
| S-EPMC3164158 | biostudies-literature

Machine Learning Generation of Dynamic Protein Conformational Ensembles.
| S-EPMC10220786 | biostudies-literature

Protein conformational ensembles in function: roles and mechanisms.
| S-EPMC10619138 | biostudies-literature

Direct generation of protein conformational ensembles via machine learning.
| S-EPMC9922302 | biostudies-literature

SPEACH_AF: Sampling protein ensembles and conformational heterogeneity with Alphafold2.
| S-EPMC9436118 | biostudies-literature

Bayesian inference of protein conformational ensembles from limited structural data.
| S-EPMC6312354 | biostudies-literature

Accessing protein conformational ensembles using room-temperature X-ray crystallography.
| S-EPMC3182744 | biostudies-literature

Multilevel superposition for deciphering the conformational variability of protein ensembles.
| S-EPMC10983786 | biostudies-literature