Dataset Information

Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations.


ABSTRACT: Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate inter-residue contact maps from deep neural-network learning with the cutting-edge I-TASSER fragment assembly simulations. Large-scale benchmark tests showed that C-I-TASSER can fold more than twice as many non-homologous proteins as I-TASSER, which does not use contacts. When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. Furthermore, it created correct folds for 85% of proteins in the SARS-CoV-2 genome, despite the high mutation rate of the virus and its sparse sequence profiles. The results demonstrate the critical importance of coupling whole-genome and metagenome-based evolutionary information with optimal structure assembly simulations for solving the problem of non-homologous protein structure prediction.

SUBMITTER: Zheng W 

PROVIDER: S-EPMC8336924 | biostudies-literature | 2021 Jul

REPOSITORIES: biostudies-literature

Publications

Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations.

Zheng W, Zhang C, Li Y, Pearce R, Bell EW, Zhang Y

Cell Reports Methods, 2021 Jun 21, issue 3


Similar Datasets

| S-EPMC34490 | biostudies-literature
| S-EPMC1878469 | biostudies-literature
| S-EPMC5911180 | biostudies-literature
| S-EPMC6302667 | biostudies-literature
| S-EPMC8852347 | biostudies-literature
| S-EPMC18658 | biostudies-literature
| S-EPMC9364379 | biostudies-literature
| S-EPMC2953502 | biostudies-literature
| S-EPMC5637520 | biostudies-literature
| S-EPMC4876523 | biostudies-literature