De novo emergence, existence, and demise of a protein-coding gene in murids [mouse]
ABSTRACT: Genes, principal units of genetic information, vary in complexity and evolutionary history. Less-complex genes, such as genes expressing long non-coding RNA (lncRNA), readily emerge de novo from non-genic sequences and have high evolutionary turnover. Genesis of a gene is facilitated by adoption of functional genic sequences from retrotransposon insertions. However, protein-coding sequences in extant genomes rarely lack any connection to an ancestral protein-coding sequence. Here, we describe the remarkable evolution of the murine gene D6Ertd527e and its orthologs in the rodent superfamily Muroidea. D6Ertd527e emerged in a common ancestor of mice and hamsters as an lncRNA-expressing gene. A major contributing factor was a long terminal repeat (LTR) retrotransposon insertion carrying an oocyte-specific promoter and one of the first exons of the gene. The gene survived as an oocyte-specific lncRNA in several extant rodents, while in others the gene or its expression was lost. In the ancestral lineage of Mus musculus, the gene acquired protein-coding capacity, with the bulk of the coding sequence formed through CAG (AGC) trinucleotide repeat expansion and duplications. These events gave rise to a cytoplasmic serine-rich maternal protein, which has no discernible role. Knock-out of D6Ertd527e in mice affects neither fertility nor the maternal transcriptome. While this evolving gene shows no notable function in laboratory mice, its documented evolutionary history in Muroidea over the last ~40 million years provides a textbook example of how several common mutation events can support de novo gene formation, the evolution of protein-coding capacity, and a gene’s demise.
ORGANISM(S): Mus musculus
PROVIDER: GSE213819 | GEO | 2022/10/31
REPOSITORIES: GEO