
Dataset Information


GenTree, an integrated resource for analyzing the evolution and function of primate-specific coding genes.


ABSTRACT: The origination of new genes contributes to phenotypic evolution in humans. Two major challenges in the study of new genes are the inference of gene ages and annotation of their protein-coding potential. To tackle these challenges, we created GenTree, an integrated online database that compiles age inferences from three major methods together with functional genomic data for new genes. Genome-wide comparison of the age inference methods revealed that the synteny-based pipeline (SBP) is most suited for recently duplicated genes, whereas the protein-family-based methods are useful for ancient genes. For SBP-dated primate-specific protein-coding genes (PSGs), we performed manual evaluation based on published PSG lists and showed that SBP generated a conservative data set of PSGs by masking less reliable syntenic regions. After assessing the coding potential based on evolutionary constraint and peptide evidence from proteomic data, we curated a list of 254 PSGs with different levels of protein evidence. This list also includes 41 candidate misannotated pseudogenes that encode primate-specific short proteins. Coexpression analysis showed that PSGs are preferentially recruited into organs with rapidly evolving pathways such as spermatogenesis, immune response, mother-fetus interaction, and brain development. For brain development, primate-specific KRAB zinc-finger proteins (KZNFs) are specifically up-regulated in the mid-fetal stage, which may have contributed to the evolution of this critical stage. Altogether, hundreds of PSGs are either recruited to processes under strong selection pressure or to processes supporting an evolving novel organ.

SUBMITTER: Shao Y 

PROVIDER: S-EPMC6442393 | biostudies-literature | 2019 Apr

REPOSITORIES: biostudies-literature


Publications

GenTree, an integrated resource for analyzing the evolution and function of primate-specific coding genes.

Shao Yi, Chen Chunyan, Shen Hao, He Bin Z, Yu Daqi, Jiang Shuai, Zhao Shilei, Gao Zhiqiang, Zhu Zhenglin, Chen Xi, Fu Yan, Chen Hua, Gao Ge, Long Manyuan, Zhang Yong E

Genome Research, 2019 Mar 12, issue 4



Similar Datasets

| S-EPMC5487532 | biostudies-literature
| S-EPMC4488272 | biostudies-literature
| S-EPMC3371700 | biostudies-literature
| S-EPMC1779597 | biostudies-literature
| S-EPMC7919455 | biostudies-literature
| S-EPMC5175370 | biostudies-literature
| S-EPMC7295570 | biostudies-literature
| S-EPMC3126818 | biostudies-literature
| S-EPMC5555488 | biostudies-literature
| S-EPMC2981486 | biostudies-literature