Dataset Information

Pairwise sequence similarity mapping with PaSiMap: Reclassification of immunoglobulin domains from titin as case study.


ABSTRACT: Sequence comparison is critical for the functional assignment of newly identified protein genes. As uncharacterized protein sequences accumulate, there is an increasing need for sensitive tools for their classification. Here, we present a novel multidimensional scaling pipeline, PaSiMap, which creates a map of pairwise sequence similarities. Uniquely, PaSiMap distinguishes between unique and shared features, allowing for a distinct view of protein-sequence relationships. We demonstrate PaSiMap's efficiency in detecting sequence groups and outliers using titin's 169 immunoglobulin (Ig) domains. We show that Ig domain similarity is hierarchical, being firstly determined by chain location, then by the loop features of the Ig fold and, finally, by super-repeat position. The existence of a previously unidentified domain repeat in the distal, constitutive I-band is revealed. Prototypic Igs, plus notable outliers, are identified and thereby domain classification improved. This re-classification can now guide future molecular research. In summary, we demonstrate that PaSiMap is a sensitive tool for the classification of protein sequences, which adds a new perspective in the understanding of inter-protein relationships. PaSiMap is applicable to any biological system defined by a linear sequence, including polynucleotide chains.

SUBMITTER: Su K 

PROVIDER: S-EPMC9529554 | biostudies-literature | 2022

REPOSITORIES: biostudies-literature

Publications

Pairwise sequence similarity mapping with PaSiMap: Reclassification of immunoglobulin domains from titin as case study.

Kathy Su, Olga Mayans, Kay Diederichs, Jennifer R. Fleming

Computational and Structural Biotechnology Journal, 2022-09-26


Similar Datasets

S-EPMC5766482 | biostudies-literature
S-EPMC7660437 | biostudies-literature
S-EPMC3289921 | biostudies-literature
S-EPMC4001778 | biostudies-literature
S-EPMC2151544 | biostudies-literature
S-EPMC2643529 | biostudies-literature
S-EPMC5614584 | biostudies-literature
S-EPMC2279972 | biostudies-literature
S-EPMC1087832 | biostudies-literature
S-EPMC10870752 | biostudies-literature