Dataset Information

Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.


ABSTRACT: As increasingly inexpensive sequencing techniques uncover more and more protein sequences, an urgent task is to determine their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than the low-resolution two-state (binding or non-binding) prediction that most existing techniques provide. The method first predicts the protein-DNA complex structure using the template-based structure prediction technique HHblits, then predicts binding affinity with a knowledge-based energy function (distance-scaled finite ideal-gas reference state for protein-DNA interactions). Leave-one-out cross-validation of the method on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77, with a precision of 94% and a sensitivity of 65%. We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on the predicted DNA-binding complex structures. Its application to the human proteome yields more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in apo form. The method, SPOT-Seq (DNA), is available as an online server at http://sparks-lab.org.
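
The relationship between the precision, sensitivity, and MCC figures quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only: the record does not report a confusion matrix, so the counts are hypothetical values back-calculated from the stated totals (179 binding, 3797 non-binding) and the stated precision and sensitivity. The function name `confusion_metrics` is likewise an assumption made for this example.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return (precision, sensitivity, MCC) for a binary confusion matrix."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, sensitivity, mcc


# Hypothetical counts, NOT reported in the source. They are chosen so that
# tp + fn = 179 binding domains and fp + tn = 3797 non-binding domains,
# with roughly 65% sensitivity and 94% precision.
tp, fn = 116, 63
fp, tn = 7, 3790

precision, sensitivity, mcc = confusion_metrics(tp, fp, fn, tn)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} MCC={mcc:.2f}")
```

Under these assumed counts the three values round to the figures given in the abstract, which suggests the reported numbers are mutually consistent. MCC is a useful summary here because non-binding domains outnumber binding ones by roughly 21 to 1, a setting in which plain accuracy would be dominated by the negative class.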

SUBMITTER: Zhao H 

PROVIDER: S-EPMC4008587 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

Zhao Huiying, Wang Jihua, Zhou Yaoqi, Yang Yuedong

PLoS One, 2014-05-02, issue 5


Similar Datasets

| S-EPMC5796408 | biostudies-literature
| S-EPMC2788376 | biostudies-literature
| S-EPMC6389706 | biostudies-literature
| S-EPMC3930195 | biostudies-literature
| S-EPMC2709252 | biostudies-literature
| S-EPMC1993824 | biostudies-literature
| S-EPMC2639300 | biostudies-literature