
Dataset Information


Fast protein classification by using the most significant pairs.


ABSTRACT: This study introduces a new approach to speeding up protein classification. The basic idea is to rewrite the sequences of each family using the most significant pairs, where 400 distinct pairs can appear in protein sequences. The sequence length could be reduced to 0.86, 0.91 and 0.95 by using the 100, 200 and 300 most significant pairs, respectively. The average time reduction is 0.53 %, 0.33 % and 0.22 % for 100, 200 and 300 pairs, respectively. In all three cases, the suggested procedure can be adopted to reduce testing time. However, to obtain a classification rate identical to that of the previous profile HMM, at least 300 pairs must be used.
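
The figure of 400 pairs follows from the 20 standard amino acids (20 x 20 ordered pairs). The sketch below enumerates those pairs and ranks the pairs found in a family of sequences. It is an illustration only: the abstract does not define "significant", so ranking by raw frequency of overlapping pairs is an assumption, and the names `ALL_PAIRS` and `rank_pairs` are hypothetical. The rewriting step itself is not shown because the abstract does not specify it.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered pair of residues: 20 * 20 = 400.
ALL_PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_SET = set(ALL_PAIRS)


def rank_pairs(sequences, k):
    """Return the k most frequent overlapping residue pairs in a family.

    Assumption: "significance" is approximated by raw frequency; the
    source may use a different criterion.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            if pair in PAIR_SET:  # skip pairs with non-standard symbols
                counts[pair] += 1
    return [pair for pair, _ in counts.most_common(k)]


if __name__ == "__main__":
    family = ["ACACGT", "ACGGAC"]  # toy sequences, not real data
    print(len(ALL_PAIRS))
    print(rank_pairs(family, 1))
```

In practice `k` would be 100, 200 or 300, matching the three settings reported in the abstract.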

SUBMITTER: Al-Daoud E 

PROVIDER: S-EPMC5698897 | biostudies-literature | 2010

REPOSITORIES: biostudies-literature


Publications

Fast protein classification by using the most significant pairs.

Al-Daoud Essam E  

EXCLI Journal, 2010-10-04



Similar Datasets

| S-EPMC3982160 | biostudies-literature
| S-EPMC4410661 | biostudies-literature
| S-EPMC3515428 | biostudies-literature
| S-EPMC7788595 | biostudies-literature
| S-EPMC6395045 | biostudies-literature
| S-EPMC6238331 | biostudies-literature
2023-08-08 | GSE237874 | GEO
| S-EPMC1370104 | biostudies-other
| S-EPMC1579235 | biostudies-literature
| S-EPMC5851940 | biostudies-literature