Dataset Information

Measuring Similarity among Protein Sequences Using a New Descriptor.


ABSTRACT: Comparing protein sequences by similarity is a fundamental task in today's biomedical research. With advances in sequencing technologies, the number of protein sequences in public databases is growing exponentially. The best-known sequence comparison methods are alignment based. They generally give excellent results when the sequences under study are closely related, but they are time consuming. Herein, a new alignment-free method is introduced. Our technique depends on a new graphical representation and a new descriptor. The graphical representation is a simple way to visualize a protein sequence. The descriptor compresses the primary sequence into a single vector composed of only two values. Our approach gives good results for both short and long sequences in little computation time. It is applied to nine beta globin, nine ND5 (NADH dehydrogenase subunit 5), and 24 spike protein sequences. Correlation and significance analyses are also introduced to compare our similarity/dissimilarity results with the results of other approaches and with sequence homology.
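The abstract does not spell out the descriptor itself, so the following is only a minimal sketch of the general idea of alignment-free comparison with a two-value descriptor. It is not the authors' method: the choice of descriptor (mean and standard deviation of Kyte-Doolittle hydropathy values), the function names `descriptor`, `distance`, and `distance_matrix`, and the toy sequences are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
# ASSUMPTION: this property is a stand-in; the record does not say
# which quantities the published descriptor is built from.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def descriptor(seq):
    """Compress a protein sequence into a two-value vector (hypothetical)."""
    # Non-standard residues (e.g. X, B, Z) are skipped.
    values = [HYDROPATHY[aa] for aa in seq.upper() if aa in HYDROPATHY]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return (mean, math.sqrt(variance))


def distance(seq_a, seq_b):
    """Euclidean distance between two descriptors; smaller means more similar."""
    return math.dist(descriptor(seq_a), descriptor(seq_b))


def distance_matrix(seqs):
    """Pairwise distances for a dict mapping name -> sequence."""
    descs = {name: descriptor(s) for name, s in seqs.items()}
    return {a: {b: math.dist(descs[a], descs[b]) for b in descs} for a in descs}


# Toy fragments for illustration only, not real beta globin, ND5, or spike data.
toy = {"s1": "MVLSPADK", "s2": "MVLSGEDK", "s3": "WWWWCCCC"}
matrix = distance_matrix(toy)
```

Because each sequence is reduced to a fixed-length vector once, comparing n sequences costs n descriptor computations plus cheap vector distances, with no pairwise alignment step. That property holds for any descriptor of this kind, including the placeholder used here.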

SUBMITTER: Abo-Elkhier MM 

PROVIDER: S-EPMC6893242 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Measuring Similarity among Protein Sequences Using a New Descriptor.

Abo-Elkhier Mervat M, Abd Elwahaab Marwa A, Abo El Maaty Moheb I

BioMed Research International, 2019-11-22



Similar Datasets

| S-EPMC1976428 | biostudies-literature
| S-EPMC5680341 | biostudies-literature
| S-EPMC3227456 | biostudies-literature
| S-EPMC4068907 | biostudies-literature
| S-EPMC166169 | biostudies-literature
| S-EPMC7125777 | biostudies-literature
| S-EPMC6704681 | biostudies-literature
| S-EPMC3712219 | biostudies-literature
| S-EPMC168992 | biostudies-literature
| S-EPMC6391537 | biostudies-literature