Dataset Information

A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs.


ABSTRACT: Computational prediction of nucleic acid binding sites in proteins is necessary to disentangle functional mechanisms in most biological processes and to explore the binding mechanisms. Several strategies have been proposed, but the state-of-the-art approaches display great diversity in i) the definition of nucleic acid binding sites; ii) the training and test datasets; iii) the algorithmic methods for the prediction strategies; iv) the performance measures; and v) the distribution and availability of the prediction programs. Here we report a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. Well-defined binary assessment criteria (specificity, sensitivity, precision, accuracy…) are applied. We found that i) the tools have greatly improved over the years; ii) some of the approaches suffer from theoretical defects, and there is still room for sorting out the essential mechanisms of binding; iii) RNA binding and DNA binding appear to follow similar driving forces; and iv) dataset bias may exist in some methods.
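
The binary assessment criteria named in the abstract can be illustrated with a minimal sketch. It assumes each residue of a protein carries a predicted label and an observed label (1 = binding, 0 = non-binding) and uses the standard confusion-matrix definitions; the names `BinaryMetrics` and `assess_binding_site_prediction` are hypothetical and are not taken from the paper, whose exact binding-site definition and evaluation code are not given in this record.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    precision: float    # TP / (TP + FP)
    accuracy: float     # (TP + TN) / all residues


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a measure is undefined,
    # e.g. precision when no residue is predicted as binding.
    return numerator / denominator if denominator else float("nan")


def assess_binding_site_prediction(
    predicted: Sequence[int], observed: Sequence[int]
) -> BinaryMetrics:
    """Score per-residue binary predictions against observed labels.

    Hypothetical helper: labels are assumed to be 1 (binding) or
    0 (non-binding), one entry per residue, in the same order.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")

    pairs = [(bool(p), bool(o)) for p, o in zip(predicted, observed)]
    tp = sum(1 for p, o in pairs if p and o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    tn = sum(1 for p, o in pairs if not p and not o)

    return BinaryMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        precision=_ratio(tp, tp + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
    )


if __name__ == "__main__":
    # Toy 8-residue example (made-up labels, for illustration only).
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    observed = [1, 0, 0, 0, 1, 1, 0, 0]
    print(assess_binding_site_prediction(predicted, observed))
```

Because binding residues are usually a small minority of a protein, accuracy alone can look high for a predictor that labels almost nothing as binding, which is one reason to report the measures together.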

SUBMITTER: Miao Z 

PROVIDER: S-EPMC4683125 | biostudies-literature | 2015 Dec

REPOSITORIES: biostudies-literature

Publications

A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs.

Miao Zhichao, Westhof Eric

PLoS Computational Biology, 2015-12-17, 12


Similar Datasets

| S-EPMC2702145 | biostudies-literature
| S-EPMC6574206 | biostudies-literature
| S-EPMC4267612 | biostudies-literature
| S-EPMC5870574 | biostudies-literature
2017-05-27 | GSE98995 | GEO
| S-EPMC3601767 | biostudies-literature
| S-EPMC4341070 | biostudies-literature
| S-EPMC4436661 | biostudies-literature