Dataset Information

Benchmarking the Accuracy of AlphaFold 2 in Loop Structure Prediction.


ABSTRACT: The inhibition of protein-protein interactions is a growing strategy in drug development. In addition to structured regions, many protein loop regions are involved in protein-protein interactions and thus have been identified as potential drug targets. To effectively target such regions, knowledge of the protein structure is critical. Loop structure prediction is a challenging subproblem in the field of protein structure prediction because loop sequences are less conserved than those of secondary structure elements. AlphaFold 2 has been suggested to be one of the greatest achievements in the field of protein structure prediction. AlphaFold 2 predicted protein structures at near X-ray resolution in the Critical Assessment of protein Structure Prediction (CASP14) competition in 2020. The purpose of this work is to survey the performance of AlphaFold 2 specifically in predicting protein loop regions. We have constructed an independent dataset of 31,650 loop regions from 2,613 proteins (deposited after AlphaFold 2 was trained) with both experimentally determined structures and AlphaFold 2 predicted structures. Extensive evaluation on our dataset indicates that AlphaFold 2 is a good predictor of the structure of loop regions, especially short ones. Loops shorter than 10 residues have an average Root Mean Square Deviation (RMSD) of 0.33 Å and an average Template Modeling score (TM-score) of 0.82. However, as the number of residues in a given loop increases, the accuracy of AlphaFold 2's prediction decreases. Loops longer than 20 residues have an average RMSD of 2.04 Å and an average TM-score of 0.55. Such a correlation between accuracy and loop length is directly linked to the increase in flexibility. Moreover, AlphaFold 2 slightly over-predicts α-helices and β-strands in proteins.
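
As an illustration of the RMSD metric and the per-length averages reported in the abstract, the following is a minimal Python sketch, not the authors' pipeline. It assumes the predicted and experimental loop coordinates have already been superimposed and matched atom by atom, and it computes RMSD = sqrt((1/N) Σ |a_i − b_i|²). The function names (rmsd, length_bin, mean_rmsd_by_bin) are hypothetical, and the middle bin of 10 to 20 residues is an assumption, since the abstract only describes loops shorter than 10 and longer than 20 residues.

```python
import math
from collections import defaultdict


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the two coordinate sets are already superimposed and that
    atom i in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def length_bin(n_residues):
    """Assign a loop to a length bin (the 10-20 bin is an assumption)."""
    if n_residues < 10:
        return "<10"
    if n_residues <= 20:
        return "10-20"
    return ">20"


def mean_rmsd_by_bin(loops):
    """Average RMSD per length bin for (n_residues, rmsd_value) pairs."""
    grouped = defaultdict(list)
    for n_residues, value in loops:
        grouped[length_bin(n_residues)].append(value)
    return {name: sum(values) / len(values) for name, values in grouped.items()}


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    toy_loops = [(5, 0.2), (7, 0.4), (25, 2.0)]
    print(mean_rmsd_by_bin(toy_loops))
```

In practice a structural superposition step (for example a Kabsch alignment) would precede the RMSD calculation; it is omitted here to keep the sketch self-contained.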

SUBMITTER: Stevens AO 

PROVIDER: S-EPMC9312937 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC9278006 | biostudies-literature
| S-EPMC8371605 | biostudies-literature
| S-BSST863 | biostudies-other
| S-EPMC3211142 | biostudies-literature
| S-EPMC5408826 | biostudies-other
| S-EPMC2373940 | biostudies-literature
| S-EPMC7738749 | biostudies-literature
| S-EPMC1513386 | biostudies-literature