
Dataset Information


3D representations of amino acids: applications to protein sequence comparison and classification.


ABSTRACT: The amino acid sequence of a protein is the key to understanding its structure and ultimately its function in the cell. This paper addresses the fundamental issue of encoding amino acids so that the resulting representation of a protein sequence facilitates the decoding of its information content. We show that a feature-based representation in a three-dimensional (3D) space derived from amino acid substitution matrices provides an adequate representation that can be used for direct comparison of protein sequences based on geometry. We measure the performance of such a representation in the context of the protein structural fold prediction problem. We compare the results of classifying different sets of proteins belonging to distinct structural folds against classifications of the same proteins obtained from sequence alone or directly from structural information. We find that sequence alone performs poorly as a structure classifier. We show, in contrast, that the use of the three-dimensional representation of the sequences significantly improves the classification accuracy. We conclude with a discussion of the current limitations of such a representation and with a description of potential improvements.
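
The idea of placing amino acids in a 3D space derived from a substitution matrix can be illustrated with a small sketch. The abstract does not state which embedding procedure the authors use, so the example below swaps in classical multidimensional scaling (MDS) as one plausible way to do it. The 4-letter alphabet, the score values, and the function names `embed_3d`, `encode`, and `sequence_distance` are all hypothetical and chosen only for illustration; the scores are not taken from BLOSUM, PAM, or any published matrix.

```python
import numpy as np

# Hypothetical toy similarity scores for a 4-letter alphabet.
# These values are made up for illustration and do NOT come from
# BLOSUM, PAM, or any other published substitution matrix.
ALPHABET = "ARND"
SCORES = np.array([
    [ 4.0,  1.0,  0.0, -1.0],
    [ 1.0,  4.0, -1.0,  0.0],
    [ 0.0, -1.0,  4.0,  1.0],
    [-1.0,  0.0,  1.0,  4.0],
])


def embed_3d(scores):
    """Classical MDS: turn similarity scores into 3D coordinates.

    Assumption: squared distances are defined as
    d_ij^2 = s_ii + s_jj - 2 * s_ij, which is Euclidean when the
    score matrix is positive semidefinite (true for the toy matrix).
    """
    diag = np.diag(scores)
    d2 = diag[:, None] + diag[None, :] - 2.0 * scores
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering      # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(gram)             # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:3]              # indices of the 3 largest
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


COORDS = embed_3d(SCORES)  # one 3D point per letter, shape (4, 3)


def encode(seq):
    """Map a sequence to an array of 3D points, one per residue."""
    return np.array([COORDS[ALPHABET.index(aa)] for aa in seq])


def sequence_distance(a, b):
    """Mean Euclidean distance between corresponding residues.

    Only defined here for equal-length, ungapped sequences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return float(np.linalg.norm(encode(a) - encode(b), axis=1).mean())
```

With such an encoding, two sequences become two sets of points that can be compared geometrically, for example with `sequence_distance("ARND", "ARDN")`. A real application would need a full 20-letter matrix and a way to handle sequences of different lengths, neither of which this sketch addresses.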

SUBMITTER: Li J 

PROVIDER: S-EPMC4212284 | biostudies-literature | 2014 Aug

REPOSITORIES: biostudies-literature


Publications

3D representations of amino acids: applications to protein sequence comparison and classification.

Li Jie, Koehl Patrice

Computational and Structural Biotechnology Journal, 2014-08-01, 18



Similar Datasets

| S-EPMC7062742 | biostudies-literature
| S-EPMC5133789 | biostudies-literature
| S-EPMC4862142 | biostudies-literature
| S-EPMC10576641 | biostudies-literature
| S-EPMC3201857 | biostudies-literature
| S-EPMC6337344 | biostudies-literature
| S-EPMC9247937 | biostudies-literature
| S-EPMC9998894 | biostudies-literature
| S-EPMC7173583 | biostudies-literature
| S-EPMC8170185 | biostudies-literature