Dataset Information

5' Long serial analysis of gene expression (LongSAGE) and 3' LongSAGE for transcriptome characterization and genome annotation.


ABSTRACT: Complete genome annotation relies on precise identification of transcription units bounded by a transcription initiation site (TIS) and a polyadenylation site (PAS). To facilitate this process, we developed a set of two complementary methods, 5' Long serial analysis of gene expression (LS) and 3'LS. These analyses are based on the original SAGE and LS methods coupled with full-length cDNA cloning, and enable the high-throughput extraction of the first and the last 20 bp of each transcript. We demonstrate that the mapping of 5'LS and 3'LS tags to the genome allows the localization of TIS and PAS. By using 537 tag pairs mapping to the region of known genes, we confirmed that >90% of the tag pairs were appropriately assigned to the first and last exons. Moreover, by using tag sequences as primers for RT-PCRs, we were able to recover putative full-length transcripts in 81% of the attempts. This large-scale generation of transcript terminal tags is at least 20-40 times more efficient than full-length cDNA cloning and sequencing in the identification of complete transcription units. The apparent precision and deep coverage make 5'LS and 3'LS an advanced approach for genome annotation through whole-transcriptome characterization.
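The idea of bounding a transcription unit with terminal tags can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (terminal_tags, locate_transcription_unit), the exact-match search, and the forward-strand-only handling are all assumptions made for illustration. Only the 20 bp tag length comes from the abstract.

```python
TAG_LENGTH = 20  # tag size stated in the abstract


def terminal_tags(transcript, tag_length=TAG_LENGTH):
    """Return the (5' tag, 3' tag) pair: the first and last tag_length bases.

    Hypothetical helper; real 5'LS/3'LS tags come from cloned cDNA ends,
    not from slicing a known transcript sequence.
    """
    if len(transcript) < tag_length:
        raise ValueError("transcript is shorter than the tag length")
    return transcript[:tag_length], transcript[-tag_length:]


def locate_transcription_unit(genome, five_tag, three_tag):
    """Return the (start, end) span bounded by the two tags, or None.

    Coordinates are 0-based and half-open. Assumes exact matches on the
    forward strand and takes the first hit of each tag; a real mapper
    would also handle the reverse strand, mismatches, and multi-hits.
    """
    start = genome.find(five_tag)
    if start == -1:
        return None
    # Search for the 3' tag at or after the 5' tag position.
    three_pos = genome.find(three_tag, start)
    if three_pos == -1:
        return None
    return start, three_pos + len(three_tag)
```

Under these assumptions, the start of the returned span approximates the TIS and its end approximates the PAS. Because the span is read off the genome, it includes any introns lying between the first and last exons.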

SUBMITTER: Wei CL 

PROVIDER: S-EPMC511040 | biostudies-literature | 2004 Aug

REPOSITORIES: biostudies-literature

Publications

5' Long serial analysis of gene expression (LongSAGE) and 3' LongSAGE for transcriptome characterization and genome annotation.

Wei CL, Ng P, Chiu KP, Wong CH, Ang CC, Lipovich L, Liu ET, Ruan Y

Proceedings of the National Academy of Sciences of the United States of America, 2004-07-22, issue 32



Similar Datasets

2005-07-20 | GSE2967 | GEO
2005-07-20 | E-GEOD-2967 | biostudies-arrayexpress
2007-10-10 | GSE6009 | GEO
2007-10-09 | E-GEOD-6009 | biostudies-arrayexpress
| S-EPMC1899502 | biostudies-literature
| S-EPMC1352234 | biostudies-literature
| S-EPMC3438356 | biostudies-literature
| S-EPMC4616010 | biostudies-literature
| S-EPMC2644313 | biostudies-literature
2007-04-03 | GSE5915 | GEO