Dataset Information

ChSeq: A database of chameleon sequences.

ABSTRACT: Chameleon sequences (ChSeqs) refer to sequence strings of identical amino acids that can adopt different conformations in protein structures. Researchers have detected and studied ChSeqs to understand the interplay between local and global interactions in protein structure formation. The different secondary structures adopted by one ChSeq challenge sequence-based secondary structure predictors. With increasing numbers of available Protein Data Bank structures, we here identify a large set of ChSeqs ranging from 6 to 10 residues in length. The homologous ChSeqs discovered highlight the structural plasticity involved in biological function. When compared with previous studies, the set of unrelated ChSeqs found represents an approximately 20-fold increase in the number of detected sequences, as well as an increase in the longest ChSeq length from 8 to 10 residues. We applied secondary structure predictors to our ChSeqs and found that methods based on a sequence profile outperformed methods based on a single sequence. For the unrelated ChSeqs, the evolutionary information provided by the sequence profile typically allows successful prediction of the prevailing secondary structure adopted in each protein family. Our dataset will facilitate future studies of ChSeqs, as well as interpretations of the interplay between local and nonlocal interactions. A user-friendly web interface for this ChSeq database is available at prodata.swmed.edu/chseq.
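As an illustration of the idea in the abstract, the following is a minimal sketch of how identical 6 to 10 residue strings with different secondary structures could be detected. It is not the authors' pipeline. The function name `find_chameleons`, the one-letter state codes ('H' for helix, 'E' for strand), the requirement that a window be entirely helix or entirely strand, and the two toy chains are all assumptions made for this example.

```python
from collections import defaultdict


def find_chameleons(chains, min_len=6, max_len=10):
    """Return {kmer: {"H", "E"}} for k-mers seen both as helix and as strand.

    chains: iterable of (sequence, secondary_structure) string pairs of equal
    length. 'H' marks helix and 'E' marks strand (assumed encoding); windows
    containing any other state are ignored.
    """
    states = defaultdict(set)
    for seq, ss in chains:
        if len(seq) != len(ss):
            raise ValueError("sequence and secondary-structure strings must align")
        for k in range(min_len, max_len + 1):
            for i in range(len(seq) - k + 1):
                window = ss[i:i + k]
                # Keep only windows that are uniformly helix or uniformly strand.
                if window == "H" * k or window == "E" * k:
                    states[seq[i:i + k]].add(window[0])
    # A k-mer counts as a chameleon here only if both states were observed.
    return {kmer: s for kmer, s in states.items() if s == {"H", "E"}}


# Toy, made-up chains (not taken from the ChSeq database).
chains = [
    ("GSAVKLIEAG", "CCHHHHHHCC"),
    ("TTAVKLIEPP", "CCEEEEEECC"),
]
hits = find_chameleons(chains)
print(sorted(hits))
```

A real analysis would also need secondary structure assignments derived from Protein Data Bank coordinates and a way to separate homologous from unrelated proteins, neither of which this sketch attempts.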

SUBMITTER: Li W

PROVIDER: S-EPMC4500308 | biostudies-literature | 2015 Jul

REPOSITORIES: biostudies-literature

Publications

ChSeq: A database of chameleon sequences.

Li W, Kinch LN, Karplus PA, Grishin NV

Protein Science: a publication of the Protein Society, 2015-06-16, issue 7

PMID: 25970262

Similar Datasets

Chameleon sequences in neurodegenerative diseases. | S-EPMC7124260 | biostudies-literature

Discordant and chameleon sequences: their distribution and implications for amyloidogenicity. | S-EPMC3064835 | biostudies-literature

NetCSSP: web application for predicting chameleon sequences and amyloid fibril formation. | S-EPMC2703942 | biostudies-literature

Certain heptapeptide and large sequences representing an entire helix, strand or coil conformation in proteins are associated as chameleon sequences. | S-EPMC7124434 | biostudies-literature

Visually guided avoidance in the chameleon (Chamaeleo chameleon): response patterns and lateralization. | S-EPMC3369868 | biostudies-literature

ANTIMIC: a database of antimicrobial sequences. | S-EPMC308766 | biostudies-literature

A Curated, Comprehensive Database of Plasmid Sequences. | S-EPMC6318356 | biostudies-literature

RNAcentral: an international database of ncRNA sequences. | S-EPMC4384043 | biostudies-literature

Flavitrack: an annotated database of flavivirus sequences. | S-EPMC2629353 | biostudies-literature