Dataset Information

A unified approach to sequential and non-sequential structure alignment of proteins, RNAs, and DNAs.


ABSTRACT: Many distantly related structure pairs exhibit structural similarities that can only be fully captured by a non-sequential alignment program. We present US-align2, a unified protocol for both sequential and non-sequential alignment of proteins and nucleic acids. On manually curated reference alignments for protein structural pairs with non-sequential relations, US-align2 achieves ≥13% higher agreement with reference alignments than existing sequential and non-sequential alignment methods. Non-sequential alignments also enabled US-align2 to have higher sensitivities in detecting RNA pairs from the same family with sequence identities <40%, obtaining ≥9% higher area under the receiver operating characteristic curve than third-party programs. The unique ability of US-align2 to parse both proteins and nucleic acids allows the method to detect protein-RNA and protein-DNA mimicries. Additionally, US-align2 performs full and semi-non-sequential alignments with at least 48% and 14% faster speed than existing programs for the same tasks, making it particularly useful for large-scale structural similarity detection.

SUBMITTER: Zhang C 

PROVIDER: S-EPMC9557024 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature

Publications

A unified approach to sequential and non-sequential structure alignment of proteins, RNAs, and DNAs.

Chengxin Zhang, Anna Marie Pyle

iScience, 2022 Sep 28, issue 10


Similar Datasets

| S-EPMC2846951 | biostudies-literature
| S-EPMC6353097 | biostudies-literature
| S-EPMC5963359 | biostudies-literature
| S-EPMC3165317 | biostudies-literature
| S-EPMC4715546 | biostudies-literature
| S-EPMC2190701 | biostudies-literature
| S-EPMC4151033 | biostudies-literature
| S-EPMC2829528 | biostudies-literature
| S-EPMC1683948 | biostudies-literature
| S-EPMC5430463 | biostudies-literature