Dataset Information

A high-throughput predictive method for sequence-similar fold switchers.


ABSTRACT: Although most experimentally characterized proteins with similar sequences assume the same folds and perform similar functions, an increasing number of exceptions is emerging. One class of exceptions comprises sequence-similar fold switchers, whose secondary structures shift from α-helix ↔ β-sheet through a small number of mutations, a sequence insertion, or a deletion. Predictive methods for identifying sequence-similar fold switchers are desirable because some are associated with disease and/or can perform different functions in cells. Here, we use homology-based secondary structure predictions to identify sequence-similar fold switchers from their amino acid sequences alone. To do this, we predicted the secondary structures of sequence-similar fold switchers using three different homology-based secondary structure predictors: PSIPRED, JPred4, and SPIDER3. We found that α-helix ↔ β-strand prediction discrepancies from JPred4 discriminated between the different conformations of sequence-similar fold switchers with high statistical significance (P < 1.8 × 10⁻¹⁹). Thus, we used these discrepancies as a classifier and found that they can often robustly discriminate between sequence-similar fold switchers and sequence-similar proteins that maintain the same folds (Matthews Correlation Coefficient of 0.82). We found that JPred4 is a more robust predictor of sequence-similar fold switchers because of (a) the curated sequence database it uses to produce multiple sequence alignments and (b) its use of sequence profiles based on Hidden Markov Models. Our results indicate that inconsistencies between JPred4 secondary structure predictions can be used to identify some sequence-similar fold switchers from their sequences alone. Thus, the negative information from inconsistent secondary structure predictions can potentially be leveraged to identify sequence-similar fold switchers from the broad base of genomic sequences.
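
The discrepancy-based classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the two secondary structure predictions are already aligned position by position and encoded as strings of H (helix), E (strand), and C (coil). The function names and the `min_discrepancies` threshold are hypothetical, chosen only for illustration. The Matthews Correlation Coefficient is computed from its standard confusion-matrix definition.

```python
from math import sqrt


def helix_strand_discrepancies(ss_a: str, ss_b: str) -> int:
    """Count aligned positions where one prediction is helix (H)
    and the other is strand (E). Coil (C) positions are ignored."""
    if len(ss_a) != len(ss_b):
        raise ValueError("predictions must be aligned to the same length")
    return sum(1 for a, b in zip(ss_a, ss_b) if {a, b} == {"H", "E"})


def is_fold_switch_candidate(ss_a: str, ss_b: str, min_discrepancies: int = 5) -> bool:
    """Flag a sequence pair as a putative fold switcher.
    The default threshold is an illustrative assumption, not a value from the source."""
    return helix_strand_discrepancies(ss_a, ss_b) >= min_discrepancies


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard Matthews Correlation Coefficient from confusion-matrix counts.
    Returns 0.0 when the denominator is zero (a common convention)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy predictions for two sequence-similar variants (made-up strings).
    variant_1 = "CCHHHHHHHHCCEEEECC"
    variant_2 = "CCEEEEEEEECCEEEECC"
    print(helix_strand_discrepancies(variant_1, variant_2))
    print(is_fold_switch_candidate(variant_1, variant_2))
```

In practice the predictions would come from a tool such as JPred4 and would need to be aligned first, for example through a pairwise sequence alignment, before the positions are compared.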

SUBMITTER: Kim AK

PROVIDER: S-EPMC8404102 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

2017-02-14 | GSE81094 | GEO
2017-07-20 | GSE99866 | GEO
| S-EPMC5327727 | biostudies-literature
2013-02-12 | E-ERAD-77 | biostudies-arrayexpress
| S-EPMC4327316 | biostudies-literature
| S-EPMC5845169 | biostudies-literature
2022-01-17 | GSE126546 | GEO
| S-EPMC2373454 | biostudies-literature
| S-EPMC1513610 | biostudies-literature
| S-EPMC6540676 | biostudies-literature