
Dataset Information


Structural cuticular proteins from arthropods: annotation, nomenclature, and sequence characteristics in the genomics era.


ABSTRACT: The availability of whole genome sequences of several arthropods has provided new insights into structural cuticular proteins (CPs), in particular the distribution of different families, the recognition that these proteins may comprise almost 2% of the protein-coding genes of some species, and the identification of features that should aid in the annotation of new genomes and EST libraries as they become available. Twelve CP families are described: CPR (named after the Rebers and Riddiford Consensus); CPF (named because it has a highly conserved region consisting of about forty-four amino acids); CPFL (like the CPFs in a conserved C-terminal region); the TWDL family, named after a picturesque phenotype of one mutant member; and four families in addition to TWDL with a preponderance of low-complexity sequence that are not members of the families listed above. These four were named after particular diagnostic features as CPLCA, CPLCG, CPLCW, and CPLCP. There are also CPG, a lepidopteran family with an abundance of glycines; the apidermin family, named after three proteins in Apis mellifera; and CPAP1 and CPAP3, named because they have features analogous to peritrophins, namely one or three chitin-binding domains. Also described are common motifs and features. Four unusual CPs are discussed in detail. Data that facilitated the analysis of sequence variation of single CP genes in natural populations are analyzed.

SUBMITTER: Willis JH 

PROVIDER: S-EPMC2872936 | biostudies-literature | 2010 Mar

REPOSITORIES: biostudies-literature


Publications

Structural cuticular proteins from arthropods: annotation, nomenclature, and sequence characteristics in the genomics era.

Willis, Judith H.

Insect Biochemistry and Molecular Biology, 2010-02-18 (3)



Similar Datasets

| S-EPMC2936398 | biostudies-literature
| S-EPMC4476932 | biostudies-literature
| S-EPMC3012931 | biostudies-literature
| S-EPMC395774 | biostudies-literature
| S-EPMC2688272 | biostudies-literature
| S-EPMC5025716 | biostudies-literature
| S-EPMC3400333 | biostudies-literature
| S-EPMC2253251 | biostudies-literature
| S-EPMC3065711 | biostudies-literature