Evolutionary dynamics of polyadenylation signals and their recognition strategies in protists
ABSTRACT: Cleavage and polyadenylation are the final steps of eukaryotic mRNA 3' end formation. The most critical element in humans and other model organisms is the poly(A) signal, an AAUAAA hexamer. We recently discovered that the deeply branching eukaryote Giardia lamblia uses a different but well-defined poly(A) signal, AGURAA. To better characterize when this evolutionary shift in the poly(A) signal occurred, we performed direct RNA sequencing of four protists within the Metamonada supergroup and two outgroup protists. Both outgroup protists and the non-Giardia Metamonada species use the AAUAAA poly(A) signal, indicating that it is the ancestral signal. In contrast, all Giardia species use the WGURAA poly(A) signal, indicating that it is a derived feature within Giardia or Fornicata. The change in this ubiquitous regulatory element raises questions about which sequence features specify genuine poly(A) sites and how premature cleavage in the coding sequence (CDS) is avoided. Therefore, we used a sequence classifier, a gapped k-mer support vector machine, that could discriminate between WGURAA sites in 3'UTRs and those in the CDS (F1 = 0.97). We found that Giardia lamblia uses nucleotides directly flanking the poly(A) signal for its recognition, with downstream nucleotides being the most important. Another member of the Giardia genus, Giardia muris, uses a different strategy: almost complete depletion of WGURAA hexamers in coding sequences. Those that remain in the coding sequence can be recognized as poly(A) signals and undergo premature cleavage. These results identify unique features of the Giardia pathogens that could be targeted for drugs and highlight the diversity and evolution of mRNA 3' end formation in eukaryotes.
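The degenerate hexamer WGURAA uses standard IUPAC nucleotide codes (W = A or U, R = A or G). As a minimal illustration of how candidate signals could be located and split into CDS and 3'UTR hits, the Python sketch below scans a transcript with a regular expression. It is not the pipeline used for this dataset: the function names, the `cds_end` parameter, and the rule of assigning a hit by its start position are assumptions made for the example.

```python
import re

# IUPAC codes needed for the motifs named in the abstract (W = A/U, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "W": "[AU]", "R": "[AG]"}


def motif_to_regex(motif):
    """Compile an IUPAC motif; the lookahead lets overlapping hits be reported."""
    body = "".join(IUPAC[ch] for ch in motif)
    return re.compile("(?=(" + body + "))")


def find_signals(rna, motif="WGURAA"):
    """Return (0-based start, matched hexamer) for every motif occurrence.

    DNA-style input is accepted: T is converted to U before scanning.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in motif_to_regex(motif).finditer(rna)]


def split_by_region(rna, cds_end, motif="WGURAA"):
    """Partition hits into CDS and 3'UTR lists.

    cds_end is a hypothetical 0-based index of the first 3'UTR nucleotide;
    a hit is assigned to the region in which it starts (an assumption).
    """
    hits = find_signals(rna, motif)
    in_cds = [h for h in hits if h[0] < cds_end]
    in_utr = [h for h in hits if h[0] >= cds_end]
    return in_cds, in_utr


if __name__ == "__main__":
    toy = "CCAGUAAACCUGUGAAGG"  # made-up sequence, not from the dataset
    print(split_by_region(toy, cds_end=9))
```

Calling `find_signals(seq, motif="AAUAAA")` scans for the ancestral signal instead. A classifier such as the gapped k-mer SVM mentioned in the abstract would take the sequence context around each hit as input, which this sketch does not attempt.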
ORGANISM(S): Naegleria gruberi, Entamoeba histolytica, Giardia intestinalis, Tritrichomonas foetus, Giardia muris, Trichomonas vaginalis
PROVIDER: GSE260731 | GEO | 2024/08/01
REPOSITORIES: GEO