Dataset Information

Different evolutionary patterns of SNPs between domains and unassigned regions in human protein-coding sequences.

ABSTRACT: Protein evolution plays an important role in the evolution of each genome. Because proteins are functional molecules, their parts or sites are generally under different selective constraints, particularly from purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within each protein of a given genome. To address this, based on the PfamA annotation of all human proteins, each protein sequence can be split into two kinds of parts: domains and unassigned regions. Using this rationale, single nucleotide polymorphisms (SNPs) in protein-coding sequences from the 1000 Genomes Project were mapped into two classes: SNPs occurring within protein domains and those within unassigned regions. With these classifications, we found that the density of synonymous SNPs within domains is significantly greater than that within unassigned regions, whereas the density of non-synonymous SNPs shows the opposite pattern. We also found signatures of purifying selection on both domains and unassigned regions. Furthermore, the selective strength on domains is significantly greater than that on unassigned regions. In addition, among all human protein sequences, there are 117 PfamA domains in which no SNPs are found. Our results highlight an important aspect of protein domains and may contribute to our understanding of protein evolution.
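
The abstract's central comparison, SNP density inside domains versus inside unassigned regions, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 0-based half-open residue intervals, and the per-residue density unit are all assumptions made for illustration, and the sketch does not separate synonymous from non-synonymous SNPs.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def snp_density(protein_length, domains, snp_positions):
    """Return (domain_density, unassigned_density) in SNPs per residue.

    protein_length: number of residues in the protein
    domains: hypothetical input format, 0-based half-open (start, end) intervals
    snp_positions: 0-based residue indices carrying a SNP
    """
    merged = merge_intervals(domains)
    starts = [s for s, _ in merged]
    domain_len = sum(e - s for s, e in merged)
    unassigned_len = protein_length - domain_len

    # Count SNPs that fall inside any merged domain interval.
    in_domain = 0
    for pos in snp_positions:
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            in_domain += 1
    in_unassigned = len(snp_positions) - in_domain

    # A region class with zero length has no defined density.
    d_dom = in_domain / domain_len if domain_len else float("nan")
    d_una = in_unassigned / unassigned_len if unassigned_len else float("nan")
    return d_dom, d_una


# Toy example: one 100-residue protein with two overlapping domain annotations.
dom_density, una_density = snp_density(100, [(10, 30), (25, 50)], [5, 12, 49, 50])
```

Merging overlapping annotations first keeps a residue covered by two domains from being counted twice in the domain length. A genome-wide version would sum SNP counts and region lengths across proteins before dividing.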

SUBMITTER: Pang E

PROVIDER: S-EPMC4875946 | biostudies-literature | 2016 Jun

REPOSITORIES: biostudies-literature

Publications

Different evolutionary patterns of SNPs between domains and unassigned regions in human protein-coding sequences.

Erli Pang, Xiaomei Wu, Kui Lin

Molecular genetics and genomics : MGG, 2016-01-30, issue 3

PMID: 26833483

Similar Datasets

Different evolutionary patterns between young duplicate genes in the human genome.
| S-EPMC193656 | biostudies-literature

Association Between SNPs of Long Non-coding RNA HOTAIR and Risk of Different Cancers.
| S-EPMC6403183 | biostudies-literature

Evolutionary pressures on simple sequence repeats in prokaryotic coding regions.
| S-EPMC3315296 | biostudies-literature

Proteome-wide discovery of evolutionary conserved sequences in disordered regions.
| S-EPMC4876815 | biostudies-literature

Accurate discrimination of conserved coding and non-coding regions through multiple indicators of evolutionary dynamics.
| S-EPMC2758873 | biostudies-other

Single nucleotide polymorphisms (SNPs) in coding regions of canine dopamine- and serotonin-related genes.
| S-EPMC2268707 | biostudies-literature

Evolutionary rate and gene expression across different brain regions.
| S-EPMC2592720 | biostudies-literature

Novel internal regions of fluorescent proteins undergo divergent evolutionary patterns.
| S-EPMC2775108 | biostudies-literature

The evolutionary rates of HCV estimated with subtype 1a and 1b sequences over the ORF length and in different genomic regions.
| S-EPMC3675120 | biostudies-literature

Depletion of Shine-Dalgarno Sequences Within Bacterial Coding Regions Is Expression Dependent.
| S-EPMC5100845 | biostudies-literature