Dataset Information

A Comprehensive Analysis of Transcript-Supported De Novo Genes in Saccharomyces sensu stricto Yeasts.


ABSTRACT: Novel genes arising from random DNA sequences (de novo genes) have been suggested to be widespread in the genomes of different organisms. However, our knowledge about the origin and evolution of de novo genes is still limited. To systematically understand the general features of de novo genes, we established a robust pipeline to analyze >20,000 transcript-supported coding sequences (CDSs) from the budding yeast Saccharomyces cerevisiae. Our analysis pipeline combined phylogeny, synteny, and sequence alignment information to identify possible orthologs across 20 Saccharomycetaceae yeasts and discovered 4,340 S. cerevisiae-specific de novo genes and 8,871 S. sensu stricto-specific de novo genes. We further combined information on CDS positions and transcript structures to show that >65% of de novo genes arose from transcript isoforms of ancient genes, especially in the upstream and internal regions of ancient genes. Fourteen identified de novo genes with high transcript levels were chosen to verify their protein expression. Ten of them, including eight transcript isoform-associated CDSs, showed translation signals, and five proteins exhibited specific cytosolic localizations. Our results suggest that de novo genes frequently arise in the S. sensu stricto complex and have the potential to be quickly integrated into ancient cellular networks.

SUBMITTER: Lu TC 

PROVIDER: S-EPMC5850716 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature

Publications

A Comprehensive Analysis of Transcript-Supported De Novo Genes in Saccharomyces sensu stricto Yeasts.

Lu Tzu-Chiao, Leu Jun-Yi, Lin Wen-Chang

Molecular Biology and Evolution, 2017-11-01, issue 11

Similar Datasets

| PRJNA390293 | ENA
| S-EPMC5666360 | biostudies-literature
| S-EPMC6379720 | biostudies-literature
| S-EPMC5558958 | biostudies-literature
| S-EPMC5850487 | biostudies-literature
| S-EPMC7510981 | biostudies-literature
| S-EPMC106574 | biostudies-literature
| PRJNA278671 | ENA
2006-03-31 | GSE3406 | GEO
| S-EPMC10055226 | biostudies-literature