Dataset Information

Natural protein sequences are more intrinsically disordered than random sequences.

ABSTRACT: Most natural protein sequences have resulted from millions or even billions of years of evolution. How they differ from random sequences is not fully understood. Previous computational and experimental studies of random proteins generated from noncoding regions yielded inconclusive results due to species-dependent codon biases and GC contents. Here, we approach this problem by investigating 10,000 sequences randomized at the amino acid level. Using well-established predictors for protein intrinsic disorder, we found that natural sequences have more long disordered regions than random sequences, even when random and natural sequences have the same overall composition of amino acid residues. We also showed that random sequences are as structured as natural sequences according to contents and length distributions of predicted secondary structure, although the structures from random sequences may be in a molten globular-like state, according to molecular dynamics simulations. The bias of natural sequences toward more intrinsic disorder suggests that natural sequences are created and evolved to avoid protein aggregation and increase functional diversity.
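
The abstract mentions sequences "randomized at the amino acid level" and random sequences that share the overall residue composition of natural ones. The record does not say how the randomization was done, so the following is only a minimal sketch of two common approaches: shuffling a sequence (which preserves its composition exactly) and drawing residues uniformly from the 20 standard amino acids. The function names and the example sequence are hypothetical and are not taken from the study.

```python
import random
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def shuffle_sequence(seq, rng):
    """Return a permutation of seq; the residue composition is unchanged."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)

def uniform_random_sequence(length, rng, alphabet=AMINO_ACIDS):
    """Return a sequence with each residue drawn uniformly from alphabet."""
    return "".join(rng.choice(alphabet) for _ in range(length))

# Hypothetical input sequence, used for illustration only.
natural = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

rng = random.Random(0)  # fixed seed so the sketch is reproducible
shuffled = shuffle_sequence(natural, rng)
uniform = uniform_random_sequence(len(natural), rng)

# Shuffling keeps the amino acid counts identical to the input.
print(Counter(shuffled) == Counter(natural))
```

Either kind of randomized sequence could then be passed to a disorder predictor and compared with its natural counterpart; which predictor to use and how to score long disordered regions are left open here.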

SUBMITTER: Yu JF

PROVIDER: S-EPMC4937073 | biostudies-literature | 2016 Aug

REPOSITORIES: biostudies-literature

Publications

Natural protein sequences are more intrinsically disordered than random sequences.

Yu JF, Cao Z, Yang Y, Wang CL, Su ZD, Zhao YW, Wang JH, Zhou Y

Cellular and Molecular Life Sciences (CMLS), 2016-01-22, issue 15

PMID: 26801222

Similar Datasets

Intrinsically disordered domains deviate significantly from random sequences in mammalian proteins.
| S-EPMC2957690 | biostudies-literature

Uncovering Non-random Binary Patterns Within Sequences of Intrinsically Disordered Proteins.
| S-EPMC10178624 | biostudies-literature

The metastasis suppressor KISS1 is an intrinsically disordered protein slightly more extended than a random coil.
| S-EPMC5313212 | biostudies-literature

Protein intrinsically disordered regions have a non-random, modular architecture.
| S-EPMC10719218 | biostudies-literature

Rapid evolution of virus sequences in intrinsically disordered protein regions.
| S-EPMC4263755 | biostudies-literature

Protein Condensate Formation via Controlled Multimerization of Intrinsically Disordered Sequences.
| S-EPMC9669173 | biostudies-literature

IDP-CRF: Intrinsically Disordered Protein/Region Identification Based on Conditional Random Fields.
| S-EPMC6164615 | biostudies-literature

Intrinsically disordered sequences enable modulation of protein phase separation through distributed tyrosine motifs.
| S-EPMC5704491 | biostudies-literature

IDRWalker: A Random Walk Based Tool for Generating Intrinsically Disordered Regions in Large Protein Complexes.
| S-EPMC11270708 | biostudies-literature

Why do eukaryotic proteins contain more intrinsically disordered regions?
| S-EPMC6675126 | biostudies-literature