
Dataset Information

SNP and indel frequencies at transcription start sites and at canonical and alternative translation initiation sites in the human genome.


ABSTRACT: Single-nucleotide polymorphisms (SNPs) are the most common form of genetic variation in humans and drive phenotypic variation. Due to evolutionary conservation, SNPs and indels (insertions and deletions) are depleted in functionally important sequence elements. Recently, population-scale sequencing efforts such as the 1000 Genomes Project and the Genome of the Netherlands Project have catalogued large numbers of sequence variants. Here, we present a systematic analysis of the polymorphisms reported by these two projects in different coding and non-coding genomic elements of the human genome (intergenic regions, CpG islands, promoters, 5' UTRs, coding exons, 3' UTRs, introns, and intragenic regions). Furthermore, we were especially interested in the distribution of SNPs and indels in the direct vicinity of the transcription start site (TSS) and the translation start site (CSS). In this analysis, we found an enrichment of CpG and CpA dinucleotides and an accumulation of SNPs at base position -1 relative to the TSS that primarily involved CpG and CpA dinucleotides. Genes with a CpG dinucleotide at TSS position -1 were enriched in the functional GO terms "Phosphoprotein", "Alternative splicing", and "Protein binding". Focusing on the CSS, we compared SNP patterns in the flanking regions of canonical and alternative AUG and near-cognate start sites, considering alternative starts previously identified experimentally by ribosome profiling. We observed similar conservation patterns at canonical and alternative translation start sites, which underlines the importance of alternative translation mechanisms for cellular function.
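
The positional analysis described in the abstract (counting SNPs at each base offset relative to the TSS) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the input format, and the coordinate convention (TSS base as +1, the base immediately upstream as -1, no position 0) are assumptions made here for illustration only.

```python
from collections import Counter

def relative_position(snp_pos, tss_pos, strand):
    """Offset of a SNP relative to a TSS.

    Assumed convention: the TSS base is +1, the base immediately
    upstream is -1, and there is no position 0.
    """
    # On the minus strand, upstream lies at larger genomic coordinates.
    delta = snp_pos - tss_pos if strand == "+" else tss_pos - snp_pos
    return delta + 1 if delta >= 0 else delta

def snp_profile(tss_list, snp_positions, window=5):
    """Count SNPs at each offset within `window` bases of every TSS.

    tss_list: iterable of (chrom, pos, strand) tuples (hypothetical format)
    snp_positions: dict mapping chrom -> set of 1-based SNP coordinates
    """
    counts = Counter()
    for chrom, tss_pos, strand in tss_list:
        snps = snp_positions.get(chrom, set())
        for pos in range(tss_pos - window, tss_pos + window + 1):
            if pos in snps:
                counts[relative_position(pos, tss_pos, strand)] += 1
    return counts

# Toy data, not taken from the dataset.
toy_tss = [("chr1", 100, "+"), ("chr1", 200, "-")]
toy_snps = {"chr1": {99, 100, 201}}
profile = snp_profile(toy_tss, toy_snps)
```

With the toy input, `profile[-1]` is 2 (one SNP directly upstream of each TSS, strand-aware) and `profile[1]` is 1. A real analysis would read TSS annotations and variant calls from files and normalize the counts by the number of TSSs considered.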

SUBMITTER: Neininger K 

PROVIDER: S-EPMC6461226 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

SNP and indel frequencies at transcription start sites and at canonical and alternative translation initiation sites in the human genome.

Neininger K, Marschall T, Helms V

PLoS ONE, 2019-04-12, issue 4


Similar Datasets

| S-EPMC6954239 | biostudies-literature
| S-EPMC1448210 | biostudies-literature
| S-EPMC5287289 | biostudies-literature
| S-EPMC3898931 | biostudies-literature
| S-EPMC3025576 | biostudies-literature
| S-EPMC5934623 | biostudies-literature
| S-EPMC6813887 | biostudies-literature
| S-EPMC5341516 | biostudies-literature
| S-EPMC4965872 | biostudies-literature
| S-EPMC4907116 | biostudies-literature