Dataset Information

ProtNN: fast and accurate protein 3D-structure classification in structural and topological space.


ABSTRACT:

BACKGROUND: Studying the functions and structures of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, classifying a protein structure remains a difficult, costly, and time-consuming task. The difficulties often stem from the essential role that spatial and topological structure plays in the classification of protein structures.

RESULTS: We propose ProtNN, a novel classification approach for protein 3D-structures. Given an unannotated query protein structure and a set of annotated proteins, ProtNN assigns to the query protein the class with the highest number of votes across the k nearest neighbor reference proteins, where k is a user-defined parameter. The search for the nearest annotated structures relies on a protein-graph representation model and on pairwise similarities between vector embeddings of the query and the reference protein structures in structural and topological spaces.

CONCLUSIONS: We demonstrate through an extensive experimental evaluation that ProtNN accurately classifies several datasets with an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN scales up to a whole PDB dataset in single-process mode with no parallelization, with a runtime gain on the order of thousands of times compared to state-of-the-art approaches.
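The voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `knn_classify`, the toy two-dimensional vectors, the class labels, and the use of Euclidean distance are all assumptions made for illustration. The abstract says only that each protein is embedded as a vector derived from a protein-graph model and that the query receives the majority class among its k nearest annotated neighbors.

```python
import math
from collections import Counter


def knn_classify(query, references, k=3):
    """Return the majority label among the k references nearest to `query`.

    `references` is a list of (vector, label) pairs. Euclidean distance is
    an assumption here; the abstract does not name the similarity measure.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Rank annotated proteins by distance to the query embedding.
    ranked = sorted(references, key=lambda ref: math.dist(query, ref[0]))
    # Count class votes among the k nearest neighbors.
    votes = Counter(label for _, label in ranked[:k])
    # On a tie, the class encountered first (nearest) wins.
    return votes.most_common(1)[0][0]


# Hypothetical embeddings: two toy classes, "A" and "B".
references = [
    ([0.0, 0.0], "A"),
    ([0.1, 0.0], "A"),
    ([1.0, 1.0], "B"),
    ([1.1, 1.0], "B"),
    ([0.9, 1.2], "B"),
]

print(knn_classify([0.05, 0.05], references, k=3))
print(knn_classify([1.0, 1.1], references, k=3))
```

In this sketch the cost of a query is dominated by sorting the reference set, so a real system handling a PDB-scale collection would likely need a more efficient neighbor search; the abstract gives no detail on how ProtNN does this.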

SUBMITTER: Dhifli W 

PROVIDER: S-EPMC5034655 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

ProtNN: fast and accurate protein 3D-structure classification in structural and topological space.

Dhifli Wajdi W, Diallo Abdoulaye Baniré AB

BioData Mining, 2016-09-23


Similar Datasets

| S-EPMC5068708 | biostudies-literature
| S-EPMC1636673 | biostudies-literature
| S-EPMC4804218 | biostudies-literature
| S-EPMC6420038 | biostudies-literature
| S-EPMC5946935 | biostudies-literature
| S-EPMC2754988 | biostudies-literature
| S-EPMC6395045 | biostudies-literature
| S-EPMC4894571 | biostudies-literature
| S-EPMC3948392 | biostudies-literature
| S-EPMC8127778 | biostudies-literature