Dataset Information

Sampling multiple scoring functions can improve protein loop structure prediction accuracy.


ABSTRACT: Accurately predicting loop structures is important for understanding the functions of many proteins. To obtain highly accurate loop models, efficiently sampling the loop conformation space to discover reasonable structures is a critical step. In loop conformation sampling, coarse-grained energy (scoring) functions coupled with reduced protein representations are often used to reduce both the number of degrees of freedom and the computational time of sampling. However, because reduced representations account for many factors only implicitly, coarse-grained scoring functions can be insensitive and inaccurate, which can mislead the sampling process and cause important loop conformations to be missed. In this paper, we present a new computational sampling approach for obtaining reasonable loop backbone models, the Pareto optimal sampling (POS) method. The rationale of the POS method is to sample the function space of multiple, carefully selected scoring functions to discover an ensemble of diversified structures that are Pareto optimal with respect to all sampled conformations. The POS method can tolerate insensitivity and inaccuracy in individual scoring functions and thereby leads to a significant accuracy improvement in loop structure prediction. We apply the POS method to a set of 4-12-residue loop targets using a function space composed of backbone-only Rosetta, distance-scaled finite ideal-gas reference (DFIRE), and a triplet backbone dihedral potential developed in our lab. Our computational results show that in 501 out of 502 targets, the model sets generated by POS contain structure models within subangstrom resolution.
Moreover, the top-ranked models have a root-mean-square deviation (RMSD) of less than 1 Å in 96.8%, 84.1%, and 72.2% of the short (4-6 residues), medium (7-9 residues), and long (10-12 residues) targets, respectively, when the all-atom models are generated by local optimization from the backbone models and ranked by our recently developed Pareto optimal consensus (POC) method. Similar sampling effectiveness is also observed for a set of 13-residue loop targets.
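
The Pareto optimality criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' POS implementation: it only shows the non-dominance filter over score vectors, assuming that lower scores are better for every scoring function. The function names `dominates` and `pareto_front` and the score values are hypothetical, chosen purely for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a dominates b.

    Assumption: lower is better for every scoring function. a dominates b
    when it is no worse in all scores and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of conformations not dominated by any other one."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical score vectors, one per candidate loop conformation.
# Columns stand in for three scoring functions (for example Rosetta,
# DFIRE, and a triplet dihedral potential); the numbers are made up.
scores = [
    (-10.0, -5.0, -2.0),
    (-9.0, -6.0, -1.0),
    (-8.0, -4.0, -1.0),  # worse than or equal to the first entry everywhere
    (-7.0, -3.0, -3.0),
]

front = pareto_front(scores)
print(front)
```

The point of keeping the whole non-dominated set, rather than the single best conformation under one function, is that a conformation scored poorly by one inaccurate function can still survive if another function favors it.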

SUBMITTER: Li Y 

PROVIDER: S-EPMC3211142 | biostudies-literature | 2011 Jul

REPOSITORIES: biostudies-literature

Publications

Sampling multiple scoring functions can improve protein loop structure prediction accuracy.

Yaohang Li, Ionel Rata, Eric Jakobsson

Journal of Chemical Information and Modeling, 2011-07-08, issue 7


Similar Datasets

| S-EPMC5871981 | biostudies-literature
| S-EPMC2553011 | biostudies-literature
| S-EPMC9197983 | biostudies-literature
| S-EPMC9312937 | biostudies-literature
| S-EPMC2677743 | biostudies-literature
| S-EPMC3998890 | biostudies-literature
| S-EPMC9855734 | biostudies-literature
| S-EPMC2868011 | biostudies-literature
| S-EPMC8963302 | biostudies-literature
| S-EPMC6138000 | biostudies-literature