
Dataset Information


DNA binding protein identification by combining pseudo amino acid composition and profile-based protein representation.


ABSTRACT: DNA-binding proteins play an important role in most cellular processes. It is therefore necessary to develop an efficient predictor that identifies DNA-binding proteins based only on protein sequence information. The bottleneck in constructing a useful predictor is finding suitable features that capture the characteristics of DNA-binding proteins. We applied pseudo amino acid composition (PseAAC) to DNA-binding protein identification, and further improved PseAAC by incorporating evolutionary information through a profile-based protein representation. Combined with Support Vector Machines (SVMs), this yielded a predictor called iDNAPro-PseAAC. Experimental results on an updated benchmark dataset showed that iDNAPro-PseAAC outperformed some state-of-the-art approaches and achieved stable performance on an independent dataset. Using an ensemble learning approach to incorporate more negative samples (non-DNA-binding proteins) in the training process further improved the performance of iDNAPro-PseAAC. The web server of iDNAPro-PseAAC is available at http://bioinformatics.hitsz.edu.cn/iDNAPro-PseAAC/.
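To make the feature idea in the abstract concrete, the following is a minimal sketch of a classic (type 1) PseAAC vector in Python. It is not the authors' implementation: it assumes a single physicochemical property (the Kyte-Doolittle hydropathy scale) rather than the several properties usually combined, it omits the profile-based representation and the SVM classifier entirely, and the function name `pseaac` and the parameters `lam` and `weight` are hypothetical names chosen for illustration.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values; using this single scale is an
# assumption of this sketch, not the property set used in the paper.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (scale[a] - mean) / std for a in AMINO_ACIDS}


def pseaac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional PseAAC vector for one protein sequence.

    lam    -- number of sequence-order correlation tiers (hypothetical default)
    weight -- weight of the correlation terms (hypothetical default)
    """
    # Non-standard residues (e.g. X, B, Z) are silently dropped in this sketch.
    seq = [r for r in sequence.upper() if r in AMINO_ACIDS]
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    prop = _standardize(KYTE_DOOLITTLE)

    # First 20 components: plain amino acid composition.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # Next lam components: mean squared property difference between
    # residues that are k positions apart, for k = 1..lam.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((prop[seq[i]] - prop[seq[i + k]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    vec = pseaac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
    print(len(vec))
```

A vector like this could then be passed to any standard classifier; the abstract names SVMs, but the kernel and hyperparameters are not given on this page, so they are left out here.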

SUBMITTER: Liu B 

PROVIDER: S-EPMC4611492 | biostudies-literature | 2015 Oct

REPOSITORIES: biostudies-literature


Publications

DNA binding protein identification by combining pseudo amino acid composition and profile-based protein representation.

Liu Bin, Wang Shanyi, Wang Xiaolong

Scientific Reports, 2015-10-20



Similar Datasets

| S-EPMC4054830 | biostudies-literature
| S-EPMC4411653 | biostudies-literature
| S-EPMC7125570 | biostudies-literature
| S-EPMC3899287 | biostudies-literature