
Dataset Information


Purifying selection shapes the coincident SNP distribution of primate coding sequences.


ABSTRACT: Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNP_O/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNP_O/E is much higher at zero-fold than at nonzero-fold degenerate sites; this difference is due to an elevation of coSNP_O/E at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, and sequencing technique, and hold in broad generality across primates. We find that this discrepancy cannot be fully explained by sequence contexts, shared ancestral polymorphisms, SNP density, or recombination rate, and that coSNP_O/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNP_O/E independently, and that coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These results suggest that coSNPs may represent a "signature" during primate protein evolution.
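
As an illustration of the observed-to-expected ratio mentioned in the abstract, here is a minimal Python sketch. It assumes the simplest null model, in which human and chimpanzee SNPs fall independently across the orthologous sites, so the expected coSNP count is the product of the two SNP counts divided by the number of sites. The abstract does not state how the expectation was computed, so this formula, the function name `cosnp_obs_exp`, and the counts used are all hypothetical and are not taken from the paper.

```python
def cosnp_obs_exp(n_sites, n_human_snps, n_chimp_snps, n_cosnps):
    """Observed-to-expected coSNP ratio under an assumed independence model.

    Assumption (not from the source): SNPs in the two species occur
    independently, so expected coSNPs = human SNPs * chimp SNPs / sites.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    expected = n_human_snps * n_chimp_snps / n_sites
    if expected == 0:
        raise ValueError("expected coSNP count is zero; ratio undefined")
    return n_cosnps / expected


# Hypothetical counts, for illustration only (not data from the paper).
ratio = cosnp_obs_exp(
    n_sites=1_000_000,
    n_human_snps=10_000,
    n_chimp_snps=20_000,
    n_cosnps=300,
)
print(ratio)  # a ratio above 1 would indicate an excess of coSNPs
```

Under this sketch, the same function could be applied separately to zero-fold and nonzero-fold degenerate sites to compare the two ratios.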

SUBMITTER: Chen CY 

PROVIDER: S-EPMC4891680 | biostudies-literature | 2016 Jun

REPOSITORIES: biostudies-literature


Publications

Purifying selection shapes the coincident SNP distribution of primate coding sequences.

Chen Chia-Ying, Hung Li-Yuan, Wu Chan-Shuo, Chuang Trees-Juen

Scientific Reports, 2016-06-03



Similar Datasets

S-EPMC4906602 | biostudies-other
S-EPMC4111549 | biostudies-literature
S-EPMC3575792 | biostudies-literature
S-EPMC2928787 | biostudies-literature
S-EPMC2724415 | biostudies-literature
S-EPMC2666660 | biostudies-literature
S-EPMC3172574 | biostudies-literature
GSE85337 | GEO | 2016-12-29
S-EPMC3091326 | biostudies-literature
S-EPMC6371618 | biostudies-literature