
Dataset Information

Sequence signatures and the probabilistic identification of proteins in the Myc-Max-Mad network.


ABSTRACT: Accurate identification of specific groups of proteins by their amino acid sequence is an important goal in genome research. Here we combine information theory with fuzzy logic search procedures to identify sequence signatures or predictive motifs for members of the Myc-Max-Mad transcription factor network. Myc is a well-known oncoprotein, and this family is involved in cell proliferation, apoptosis, and differentiation. We describe a small set of amino acid sites from the N-terminal portion of the basic helix-loop-helix (bHLH) domain that provide very accurate sequence signatures for the Myc-Max-Mad transcription factor network and three of its member proteins. A predictive motif involving 28 contiguous bHLH sequence elements found 337 network proteins in the GenBank NR database with no mismatches or misidentifications. This motif also identifies at least one previously unknown fungal protein with strong affinity to the Myc-Max-Mad network. Another motif found 96% of known Myc protein sequences with only a single mismatch, including sequences from genomes previously not thought to contain Myc proteins. The predictive motif for Myc is very similar to the ancestral sequence for the Myc group estimated from phylogenetic analyses. Based on available crystal structure studies, this motif is discussed in terms of its functional consequences. Our results provide insight into evolutionary diversification of DNA binding and dimerization in a well-characterized family of regulatory proteins and provide a method of identifying signature motifs in protein families.
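
The general idea of an information-based sequence signature can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes per-site Shannon entropy as the information measure, and a simple mismatch tolerance stands in for the fuzzy logic search. The function names, the entropy threshold, and the toy alignment are all hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in bits) of the residues at one alignment site."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def build_motif(alignment, max_entropy=1.0):
    """Keep low-entropy sites and record the residues observed at each one.

    `alignment` is a list of equal-length, gap-free strings (an assumption
    of this sketch). Returns a dict mapping site index to allowed residues.
    """
    motif = {}
    for i, column in enumerate(zip(*alignment)):
        if site_entropy(column) <= max_entropy:
            motif[i] = set(column)
    return motif


def count_mismatches(sequence, motif):
    """Number of motif sites where the sequence carries a disallowed residue."""
    return sum(1 for i, allowed in motif.items() if sequence[i] not in allowed)


def matches(sequence, motif, max_mismatches=0):
    """True if the sequence fits the motif within the mismatch tolerance."""
    return count_mismatches(sequence, motif) <= max_mismatches


# Toy alignment, invented for illustration only (not real bHLH sequences).
toy_alignment = ["KRRTHN", "KRRAHN", "KRKSHN", "KRRVHN"]
toy_motif = build_motif(toy_alignment, max_entropy=1.0)
```

In the toy alignment the fourth site varies across all four sequences, so it is left out of the motif, while the third site is retained and accepts either of the two residues seen there. Calling `matches(candidate, toy_motif, max_mismatches=1)` then mirrors the "only a single mismatch" style of search described in the abstract.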

SUBMITTER: Atchley WR 

PROVIDER: S-EPMC1088358 | biostudies-literature | 2005 May

REPOSITORIES: biostudies-literature

Publications

Sequence signatures and the probabilistic identification of proteins in the Myc-Max-Mad network.

Atchley WR, Fernandes AD

Proceedings of the National Academy of Sciences of the United States of America, 2005-04-25, issue 18


Similar Datasets

| S-EPMC2441969 | biostudies-other
| S-EPMC3828851 | biostudies-literature
| S-EPMC400484 | biostudies-literature
| S-EPMC3203627 | biostudies-literature
| S-EPMC41182 | biostudies-other
| S-EPMC5515408 | biostudies-literature
2020-06-06 | GSE146385 | GEO
| S-EPMC3974684 | biostudies-literature
| S-EPMC3065584 | biostudies-literature
2020-04-29 | GSE127212 | GEO