
Dataset Information


De novo emergence, existence, and demise of a protein-coding gene in murids.


ABSTRACT:

Background

Genes, principal units of genetic information, vary in complexity and evolutionary history. Less-complex genes (e.g., long non-coding RNA (lncRNA) expressing genes) readily emerge de novo from non-genic sequences and have high evolutionary turnover. Genesis of a gene may be facilitated by adoption of functional genic sequences from retrotransposon insertions. However, protein-coding sequences in extant genomes rarely lack any connection to an ancestral protein-coding sequence.

Results

We describe the remarkable evolution of the murine gene D6Ertd527e and its orthologs in the rodent superfamily Muroidea. D6Ertd527e emerged in a common ancestor of mice and hamsters, most likely as a lncRNA-expressing gene. A major contributing factor was a long terminal repeat (LTR) retrotransposon insertion carrying an oocyte-specific promoter and a 5' terminal exon of the gene. The gene survived as an oocyte-specific lncRNA in several extant rodents, while in others the gene or its expression was lost. In the ancestral lineage of Mus musculus, the gene acquired protein-coding capacity, with the bulk of the coding sequence formed through CAG (AGC) trinucleotide repeat expansion and duplications. These events generated a cytoplasmic serine-rich maternal protein. Knock-out of D6Ertd527e in mice has a small but detectable effect on fertility and the maternal transcriptome.

Conclusions

While this evolving gene does not show a clear function in laboratory mice, its documented evolutionary history in Muroidea during the last ~40 million years provides a textbook example of how several common mutation events can support de novo gene formation, evolution of protein-coding capacity, and a gene's demise.

SUBMITTER: Petrzilek J 

PROVIDER: S-EPMC9733328 | biostudies-literature | 2022 Dec

REPOSITORIES: biostudies-literature


Publications

De novo emergence, existence, and demise of a protein-coding gene in murids.

Petrzilek J, Pasulka J, Malik R, Horvat F, Kataruka S, Fulka H, Svoboda P

BMC Biology, 2022-12-08, issue 1



Similar Datasets

2022-10-30 | GSE213820 | GEO
2022-10-31 | GSE213819 | GEO
2022-10-31 | GSE213818 | GEO
| PRJNA882482 | ENA
| PRJNA882485 | ENA
| PRJNA882483 | ENA
| S-EPMC2390625 | biostudies-literature
| S-EPMC3213175 | biostudies-literature
| S-EPMC2845654 | biostudies-literature
| S-EPMC4829534 | biostudies-literature