Dataset Information

FoldX accurate structural protein-DNA binding prediction using PADA1 (Protein Assisted DNA Assembly 1).


ABSTRACT: The speed at which new genomes are being sequenced highlights the need for genome-wide methods capable of predicting protein-DNA interactions. Here, we present PADA1, a generic algorithm that accurately models structural complexes and predicts the DNA-binding regions of resolved protein structures. PADA1 relies on a library of protein and double-stranded DNA fragment pairs obtained from a training set of 2103 DNA-protein complexes. It includes a fast statistical force field, computed from atom-atom distances, to evaluate and filter the 3D docking models. Using published benchmark validation sets and 212 DNA-protein structures published after 2016, we predicted the DNA-binding regions with an RMSD of <1.8 Å per residue in >95% of the cases. We show that the docked templates are of sufficient quality for the FoldX protein design tool suite to identify the crystallized DNA sequence as the most energetically favorable in 80% of the cases. We highlighted the biological potential of PADA1 by reconstituting DNA and protein conformational changes upon mutagenesis of a meganuclease and its variants, and by predicting DNA-binding regions and nucleotide sequences in proteins crystallized without DNA. These results open up new perspectives for the engineering of DNA-protein interfaces.
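
The RMSD figure quoted in the abstract can be made concrete with a minimal sketch. This is not PADA1 or FoldX code. It assumes the predicted and reference atoms are already paired one-to-one and expressed in the same coordinate frame (no superposition step), and the coordinates and the `rmsd` helper are hypothetical. How the paper aggregates the value "per residue" is not specified in this record, so the sketch computes a plain RMSD over one set of matched atoms.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes both lists are already in the same frame; no alignment is done.
    The result is in the same unit as the inputs (here, angstroms).
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        total += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(total / len(predicted))


# Hypothetical coordinates in angstroms: every atom is displaced by 1 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

value = rmsd(predicted, reference)
# 1.8 Å is the threshold quoted in the abstract.
within_threshold = value < 1.8
```

In practice, structures are usually superimposed before the deviation is measured; that step is omitted here to keep the sketch short.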

SUBMITTER: Blanco JD 

PROVIDER: S-EPMC5934639 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature

Publications

FoldX accurate structural protein-DNA binding prediction using PADA1 (Protein Assisted DNA Assembly 1).

Blanco Javier Delgado, Radusky Leandro, Climente-González Héctor, Serrano Luis

Nucleic Acids Research, 2018-05-01, issue 8


Similar Datasets

| S-EPMC2271153 | biostudies-literature
| S-EPMC4897909 | biostudies-literature
| S-EPMC3378576 | biostudies-literature
| S-EPMC2653190 | biostudies-literature
| S-EPMC3386101 | biostudies-literature
| S-EPMC4843489 | biostudies-literature
| S-EPMC9528988 | biostudies-literature
| S-EPMC4939519 | biostudies-literature
| S-EPMC7697539 | biostudies-literature
| S-EPMC10719378 | biostudies-literature