Efficient targeted transcript discovery via array-based normalization of RACE libraries
ABSTRACT: RACE (Rapid Amplification of cDNA Ends) is a widely used approach for transcript identification. However, the dynamic range of abundances in a population of RACE transcript isoforms may be very large, and random clone selection (the typical approach) may sample the different transcript species in the population ineffectively. Here, we describe an effective RACE sampling strategy. The products of the RACE reaction are hybridized onto high-density tiling arrays, and the exons detected are then used to design a series of RT-PCR reactions, through which the original RACE mixture is segregated into a number of simpler RT-PCR reactions. These are independently cloned, and randomly selected clones are sequenced. This approach is superior to the direct cloning and sequencing of the RACE products: it specifically targets novel transcripts, and often leads to the overall normalization of their abundances. We show theoretically that this strategy leads to a very efficient sampling of the novel transcript species associated with annotated loci. In a pilot experiment, we used this approach to discover many novel transcripts for a few otherwise well-characterized protein-coding genes. Finally, we investigate how this strategy can be multiplexed for large-scale transcript discovery by high-density pooling of RACE reactions prior to hybridization. Our results indicate that by interrogating a limited number of exons per gene in a limited number of cell types, it is possible to recover a large fraction of the transcript diversity associated with protein-coding loci. These loci, however, could occupy a much larger genomic space than previously expected, implying that efficient multiplexing requires non-trivial pooling optimization.
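The sampling argument in the abstract can be illustrated with a small calculation. The sketch below is not the authors' theoretical model; it is a minimal illustration that assumes clones are picked independently with probability proportional to abundance. The abundance values, the number of clones, and the way species are split into pools are all hypothetical and chosen only to show the effect of segregating a skewed mixture into simpler reactions.

```python
def expected_distinct(abundances, n_clones):
    """Expected number of distinct species seen when n_clones are picked
    at random, each pick proportional to abundance (with replacement)."""
    total = sum(abundances)
    # A species with relative abundance p is missed by all picks with
    # probability (1 - p) ** n_clones.
    return sum(1.0 - (1.0 - a / total) ** n_clones for a in abundances)


# Hypothetical RACE mixture: one dominant annotated isoform and
# five rare novel isoforms (arbitrary abundance units).
mixture = [1000.0, 10.0, 5.0, 2.0, 1.0, 1.0]
n_clones = 48

# Direct cloning: all clones are drawn from the full mixture.
direct = expected_distinct(mixture, n_clones)

# Hypothetical segregation into three simpler reactions, with the
# same total sequencing effort divided equally among them.
pools = [[1000.0, 10.0], [5.0, 2.0], [1.0, 1.0]]
clones_per_pool = n_clones // len(pools)
segregated = sum(expected_distinct(pool, clones_per_pool) for pool in pools)

print(f"direct cloning:   {direct:.2f} species expected")
print(f"segregated pools: {segregated:.2f} species expected")
```

Under these assumed numbers, the segregated design is expected to recover more distinct species than direct cloning for the same number of sequenced clones, because rare species no longer compete with the dominant isoform within their own reaction.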
ORGANISM(S): Homo sapiens
PROVIDER: GSE11433 | GEO | 2008/05/25
SECONDARY ACCESSION(S): PRJNA106459
REPOSITORIES: GEO