Dataset Information

A probabilistic model of 3' end formation in Caenorhabditis elegans.


ABSTRACT: The 3' ends of mRNAs terminate with a poly(A) tail. This post-transcriptional modification is directed by sequence features present in the 3'-untranslated region (3'-UTR). We have undertaken a computational analysis of 3' end formation in Caenorhabditis elegans. By aligning cDNAs that diverge from genomic sequence at the poly(A) tract, we accurately identified a large set of true cleavage sites. When there are many transcripts aligned to a particular locus, local variation of the cleavage site over a span of a few bases is frequently observed. We find that in addition to the well-known AAUAAA motif there are several regions with distinct nucleotide compositional biases. We propose a generalized hidden Markov model that describes sequence features in C. elegans 3'-UTRs. We find that a computer program employing this model accurately predicts experimentally observed 3' ends even when there are multiple AAUAAA motifs and multiple cleavage sites. We have made available a complete set of polyadenylation site predictions for the C. elegans genome, including a subset of 6570 supported by aligned transcripts.

SUBMITTER: Hajarnavis A 

PROVIDER: S-EPMC443532 | biostudies-other | 2004

REPOSITORIES: biostudies-other

Publications

A probabilistic model of 3' end formation in Caenorhabditis elegans.

Ashwin Hajarnavis, Ian Korf, Richard Durbin

Nucleic Acids Research, 2004-06-24, issue 11


Similar Datasets

S-EPMC83895 | biostudies-literature
S-EPMC1526663 | biostudies-literature
S-EPMC3057491 | biostudies-literature
GSE26691 | GEO | 2011-01-21
S-EPMC5387690 | biostudies-literature
S-EPMC5362939 | biostudies-literature
S-EPMC2823417 | biostudies-literature
S-EPMC2268801 | biostudies-other
S-EPMC4905018 | biostudies-other
E-GEOD-26691 | biostudies-arrayexpress | 2011-01-21