Dataset Information

Dinoflagellate Gene Structure and Intron Splice Sites in a Genomic Tandem Array.


ABSTRACT: Dinoflagellates are one of the last major lineages of eukaryotes for which little is known about genome structure and organization. We report here the sequence and gene structure of a clone isolated from a cosmid library which, to our knowledge, represents the largest contiguously sequenced dinoflagellate genomic tandem gene array. These data, combined with information from a large transcriptomic library, allowed a high level of confidence in every base pair call. This degree of confidence is not possible with PCR-based contigs. The sequence contains an intron-rich set of five highly expressed gene repeats arranged in tandem. One member of the tandem repeat contains an intron 26,372 bp long. This study characterizes a splice site consensus sequence for dinoflagellate introns. The two to nine base pairs around the 3' splice site are an identical copy of the two to nine base pairs around the 5' splice site. The 5' and 3' splice sites occupy the same position within each copy of the repeat, so the repeat appears only once in the mature mRNA. This identically repeated intron boundary sequence might be useful in gene modeling and genome annotation.
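
The repeated intron boundary described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method. It assumes that the intron coordinates are already known (for example, from a transcript-to-genome alignment) and simply extends outward from both splice sites while the bases match. The function names `boundary_repeat` and `splice`, and the toy sequence `GENOME` with its repeat `AGGT`, are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
from typing import NamedTuple


class BoundaryRepeat(NamedTuple):
    left: int      # matching bases immediately upstream of both splice sites
    right: int     # matching bases immediately downstream of both splice sites
    sequence: str  # the sequence shared by the two intron boundaries


def boundary_repeat(genome: str, intron_start: int, intron_end: int) -> BoundaryRepeat:
    """Find the sequence shared by the 5' and 3' boundaries of one intron.

    intron_start: 0-based index of the first intron base.
    intron_end:   0-based index of the first base of the downstream exon.
    """
    intron_len = intron_end - intron_start

    # Walk upstream from both splice sites while the bases agree.
    left = 0
    while (left < intron_len
           and intron_start - left - 1 >= 0
           and genome[intron_start - left - 1] == genome[intron_end - left - 1]):
        left += 1

    # Walk downstream from both splice sites while the bases agree.
    right = 0
    while (right < intron_len
           and intron_end + right < len(genome)
           and genome[intron_start + right] == genome[intron_end + right]):
        right += 1

    repeat = genome[intron_start - left:intron_start + right]
    return BoundaryRepeat(left, right, repeat)


def splice(genome: str, intron_start: int, intron_end: int) -> str:
    """Remove the intron and join the flanking exons."""
    return genome[:intron_start] + genome[intron_end:]


# Toy example (invented sequence): exon 1, repeat, intron core, repeat, exon 2.
# The splice sites fall in the middle of each copy of the repeat "AGGT".
GENOME = "ATGGCA" + "AGGT" + "CCCTTTCCC" + "AGGT" + "TTGA"
START, END = 8, 21

if __name__ == "__main__":
    rep = boundary_repeat(GENOME, START, END)
    print(rep)
    print(splice(GENOME, START, END))
```

Because the two boundary copies are identical, sliding both splice coordinates by the same offset anywhere within the repeat yields the same spliced sequence. In this sketch, that is why the repeat appears only once in the mature transcript, and why a transcript alignment alone cannot place the splice site at a single position inside the repeat.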

SUBMITTER: Mendez GS 

PROVIDER: S-EPMC5032977 | biostudies-literature | 2015 Sep-Oct

REPOSITORIES: biostudies-literature

Publications

Dinoflagellate Gene Structure and Intron Splice Sites in a Genomic Tandem Array.

Gregory S. Mendez, Charles F. Delwiche, Kirk E. Apt, J. Casey Lippmeier

The Journal of Eukaryotic Microbiology, 2015-06-08, issue 5

Similar Datasets

S-EPMC3465430 | biostudies-literature
S-EPMC1669710 | biostudies-literature
S-EPMC1325015 | biostudies-literature
S-EPMC5010894 | biostudies-other
S-EPMC2279118 | biostudies-literature
S-EPMC2488372 | biostudies-literature
S-EPMC7554774 | biostudies-literature
S-EPMC1197134 | biostudies-literature
S-EPMC373407 | biostudies-literature
S-EPMC8055015 | biostudies-literature