ABSTRACT: BACKGROUND: Cyclin-dependent kinases (CDKs) are a large family of proteins that function in a variety of key regulatory pathways in eukaryotic cells, including control of the cell cycle and gene transcription. Among the most important and broadly studied of these roles is reversible phosphorylation of the C-terminal domain (CTD) of RNA polymerase II (RNAP II), part of a complex array of CTD/protein interactions that coordinate the RNAP II transcription cycle. The RNAP II CTD is strongly conserved in some groups of eukaryotes, but highly degenerate or absent in others; the reasons for these differences in stabilizing selection on CTD structure are not clear. Given the importance of reversible phosphorylation for CTD-based transcription, the distribution and evolutionary history of CDKs may be a key to understanding differences in constraints on CTD structure; however, the origins and evolutionary relationships of CTD kinases have not been investigated thoroughly. Moreover, although the functions of most CDKs are reasonably well studied in mammals and yeasts, very little is known about them in most other eukaryotes.
RESULTS: Here we identify 123 CDK family members from animals, plants, yeasts, and four protists whose genome sequences have been completed, and 10 additional CDKs from incomplete genome sequences of organisms with known CTD sequences. Comparative genomic and phylogenetic analyses suggest that cell-cycle CDKs are present in all organisms sampled in this study. In contrast, no clear orthologs of transcription-related CDKs are identified in the putatively most ancestral eukaryotes, Trypanosoma and Giardia. The kinases involved in CTD phosphorylation (CDK7, CDK8 and CDK9) are all recovered as well-supported and distinct orthologous families, but their relationships to each other and to other CDKs are not well resolved. Significantly, clear orthologs of CDK7 and CDK8 are restricted to organisms belonging to groups in which the RNAP II CTD is strongly conserved.
CONCLUSIONS: The apparent origins of CDK7 and CDK8, or at least their conservation as clearly recognizable orthologous families, correlate with strong stabilizing selection on RNAP II CTD structure. This suggests co-evolution of the CTD and these CTD-directed CDKs. This observation is consistent with the hypothesis that CDK7 and CDK8 originated at about the same time that the CTD was canalized as the staging platform for RNAP II transcription. Alternatively, extensive CTD phosphorylation may occur in only a subset of eukaryotes and, when present, this interaction results in greater stabilizing selection on both CTD and CDK sequences. Overall, our results suggest that transcription-related kinases originated after cell-cycle-related CDKs, and became more evolutionarily and functionally diverse as transcriptional complexity increased.