ABSTRACT: The analysis of differentially expressed genes is a powerful approach to elucidate the genetic mechanisms underlying the morphological and evolutionary diversity among serially homologous structures, both within the same organism (e.g., hand vs. foot) and between different species (e.g., hand vs. wing). In the developing embryo, limb-specific expression of Pitx1, Tbx4, and Tbx5 regulates the determination of limb identity. However, numerous lines of evidence, including the fact that these three genes encode transcription factors, indicate that additional genes are involved in the Pitx1-Tbx hierarchy. To examine the molecular distinctions encoded by these factors, and to identify novel genes involved in the determination of limb identity, we have used Serial Analysis of Gene Expression (SAGE) to generate comprehensive gene expression profiles from intact, developing mouse forelimbs and hindlimbs. To minimize the extraction of erroneous SAGE tags from low-quality sequence data, we used a new algorithm to extract tags from phred-analyzed sequence data and obtained 68,406 and 68,450 SAGE tags from forelimb and hindlimb SAGE libraries, respectively. We also developed an improved method for determining the identity of SAGE tags that increases the specificity of the tag-UniGene cluster match and provides additional information about its confidence. The most differentially expressed gene between our SAGE libraries was Pitx1. The differential expression of Tbx4, Tbx5, and several limb-specific Hox genes was also detected; however, their abundances in the SAGE libraries were low. Because numerous other tags were differentially expressed at this low level, we performed a 'virtual' subtraction with 362,344 tags from six additional nonlimb SAGE libraries to further refine this set of candidate genes. This subtraction reduced the number of candidate genes by 74%, yet preserved the previously identified regulators of limb identity.
This study presents the gene expression complexity of the developing limb and identifies candidate genes involved in the regulation of limb identity. We propose that our computational tools and the overall strategy used here are broadly applicable to other SAGE-based studies in a variety of organisms. [All SAGE data are available at GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession nos. GSM55 and GSM56, which correspond to the forelimb and hindlimb raw SAGE data, respectively.]
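The 'virtual' subtraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each SAGE library is a mapping from tag sequence to count, that a tag is a candidate when its forelimb and hindlimb counts differ by at least a chosen amount, and that a candidate is discarded when its pooled count across the nonlimb libraries exceeds a chosen ceiling. The function names, the thresholds `min_diff` and `max_nonlimb_count`, and the toy tag sequences are all hypothetical.

```python
from collections import Counter


def differential_tags(fore, hind, min_diff=3):
    """Return tags whose counts differ by at least min_diff between two libraries.

    The absolute-difference criterion is an assumption made for illustration;
    the abstract does not state the statistical test that was used.
    """
    tags = set(fore) | set(hind)
    return {t for t in tags if abs(fore.get(t, 0) - hind.get(t, 0)) >= min_diff}


def virtual_subtraction(candidates, nonlimb_libraries, max_nonlimb_count=0):
    """Keep only candidates that are rare or absent in the pooled nonlimb libraries.

    max_nonlimb_count is a hypothetical ceiling on the pooled nonlimb count.
    """
    pooled = Counter()
    for library in nonlimb_libraries:
        pooled.update(library)  # adds counts tag by tag
    return {t for t in candidates if pooled[t] <= max_nonlimb_count}


# Toy data: invented 10-base tags and counts, not taken from the study.
forelimb = {"AAAAAAAAAA": 1, "CCCCCCCCCC": 6, "GGGGGGGGGG": 5}
hindlimb = {"AAAAAAAAAA": 9, "CCCCCCCCCC": 1, "GGGGGGGGGG": 5, "TTTTTTTTTT": 4}
nonlimb = [{"CCCCCCCCCC": 12}, {"TTTTTTTTTT": 1, "GGGGGGGGGG": 3}]

candidates = differential_tags(forelimb, hindlimb)
refined = virtual_subtraction(candidates, nonlimb)
print(sorted(candidates))
print(sorted(refined))
```

In this toy case the subtraction removes the candidates that also occur in the nonlimb libraries, which mirrors the idea of narrowing a list of low-abundance, differentially expressed tags to those more likely to be limb specific. Raising `max_nonlimb_count` makes the filter more permissive.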