Dataset Information

Large-scale discovery and characterization of protein regulatory motifs in eukaryotes.


ABSTRACT: The increasing ability to generate large-scale, quantitative proteomic data has brought with it the challenge of analyzing such data to discover the sequence elements that underlie systems-level protein behavior. Here we show that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information theoretic approach. Using this approach, we have identified many known protein motifs, such as phosphorylation sites and localization signals, and discovered a large number of candidate elements. We estimate that ~80% of these are novel predictions in that they do not match a known motif in both sequence and biological context, suggesting that post-translational regulation of protein behavior is still largely unexplored. These predicted motifs, many of which display preferential association with specific biological pathways and non-random positioning in the linear protein sequence, provide focused hypotheses for experimental validation.
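As a rough illustration of what an information-theoretic motif score can look like, the sketch below computes the mutual information between the presence of a short linear motif and a categorical protein label such as sub-cellular localization. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the regular-expression motif representation, and the toy sequences and labels are all hypothetical.

```python
import math
import re
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def motif_presence(sequences, pattern):
    """Return True/False per sequence depending on whether the motif occurs."""
    regex = re.compile(pattern)
    return [regex.search(seq) is not None for seq in sequences]


# Hypothetical toy data: four made-up sequences with made-up localization labels.
sequences = ["MKRKRSA", "AAKRKRL", "MSTLLGG", "GGSTAAL"]
labels = ["nuclear", "nuclear", "cytosol", "cytosol"]

# A motif that tracks the label perfectly carries 1 bit of information here.
informative = mutual_information(motif_presence(sequences, r"KRKR"), labels)

# A motif distributed independently of the label carries 0 bits.
uninformative = mutual_information(motif_presence(sequences, r"M"), labels)
```

In a sketch like this, candidate motifs would be ranked by their score, and a real analysis would also need a significance test (for example, shuffling the labels) before treating a high-scoring motif as a prediction.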

SUBMITTER: Lieber DS 

PROVIDER: S-EPMC3012054 | biostudies-literature | 2010 Dec

REPOSITORIES: biostudies-literature

Publications

Large-scale discovery and characterization of protein regulatory motifs in eukaryotes.

Lieber DS, Elemento O, Tavazoie S

PLoS ONE, 2010-12-29, issue 12


Similar Datasets

| S-EPMC1779301 | biostudies-literature
| S-EPMC8258673 | biostudies-literature
| S-EPMC5499015 | biostudies-literature
| S-EPMC8605023 | biostudies-literature
| S-EPMC3950668 | biostudies-literature
| S-EPMC3907043 | biostudies-literature
| S-EPMC4466703 | biostudies-literature
| S-EPMC2846625 | biostudies-literature
| S-EPMC3418279 | biostudies-literature
2024-07-24 | GSE243553 | GEO