Dataset Information

Long-range bidirectional strand asymmetries originate at CpG islands in the human genome.


ABSTRACT: In the human genome, CpG islands (CGIs), which are GC- and CpG-rich sequences, are associated with transcription starting sites (TSSs); in addition, there is evidence that CGIs harbor origins of bidirectional replication (OBRs) and are preferred sites for heteroduplex formation during recombination. Transcription, replication, and recombination processes are known to induce specific mutational patterns in various genomes, and therefore, these patterns are expected to be found around CGIs. We use triple alignments of human, chimp, and macaque to compute the rates of nucleotide substitutions in intergenic regions up to 1 Mbp long on both sides of CGIs. Our analysis revealed that around a CGI there is an asymmetry between complementary substitution rates that is similar to the one found around the OBR in bacteria. We hypothesize that these asymmetries are induced by differences in the replication of the leading and lagging strands and that a significant number of CGIs overlap OBRs. Within CGIs, we observed a mutational signature of GC-biased gene conversion that is associated with recombination. We suggest that recombination has played a major role in the creation of CGIs.
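The idea of an asymmetry between complementary substitution rates can be illustrated with a minimal sketch. It is not the estimator used in the publication: it assumes an ancestral sequence has already been inferred (for example, from the triple alignment with macaque as outgroup), counts substitutions naively per site, and uses a normalized difference as the asymmetry measure. The function names `substitution_rates` and `strand_asymmetry` are hypothetical.

```python
# Hypothetical sketch: naive per-site substitution rates and a normalized
# asymmetry between a substitution and its reverse-strand complement.
# Assumes the ancestral sequence is already known; this is an illustration,
# not the method used in the publication.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def substitution_rates(ancestral, derived):
    """Return {(a, b): rate} for a != b, where rate is the number of
    a->b changes divided by the number of ancestral a sites."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned and of equal length")
    site_counts = {}
    sub_counts = {}
    for a, b in zip(ancestral.upper(), derived.upper()):
        if a not in COMPLEMENT or b not in COMPLEMENT:
            continue  # skip gaps and ambiguous bases
        site_counts[a] = site_counts.get(a, 0) + 1
        if a != b:
            sub_counts[(a, b)] = sub_counts.get((a, b), 0) + 1
    return {pair: n / site_counts[pair[0]] for pair, n in sub_counts.items()}


def strand_asymmetry(rates, a, b):
    """Normalized difference between the rate of a->b and the rate of the
    complementary substitution comp(a)->comp(b); 0.0 means no asymmetry."""
    forward = rates.get((a, b), 0.0)
    complementary = rates.get((COMPLEMENT[a], COMPLEMENT[b]), 0.0)
    total = forward + complementary
    return 0.0 if total == 0 else (forward - complementary) / total


if __name__ == "__main__":
    rates = substitution_rates("AAAATTTT", "AGGATTTC")
    print(rates)
    print(strand_asymmetry(rates, "A", "G"))
```

If both strands mutated identically, the rate of A→G would equal the rate of T→C on the same strand and the measure would be zero; a positive or negative value on one side of a CGI that flips sign on the other side would correspond to the kind of bidirectional pattern the abstract describes.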

SUBMITTER: Polak P 

PROVIDER: S-EPMC2817419 | biostudies-literature | 2009 Aug

REPOSITORIES: biostudies-literature

Publications

Long-range bidirectional strand asymmetries originate at CpG islands in the human genome.

Polak P, Arndt PF

Genome Biology and Evolution, 2009-08-03


Similar Datasets

| S-EPMC3256200 | biostudies-literature
| S-EPMC2817693 | biostudies-literature
| S-EPMC2926781 | biostudies-literature
| S-EPMC2652441 | biostudies-literature
| S-EPMC3125183 | biostudies-literature
| S-EPMC187522 | biostudies-literature
| S-EPMC2944787 | biostudies-literature
| S-EPMC3149076 | biostudies-literature
| S-EPMC4121359 | biostudies-literature
| S-EPMC310670 | biostudies-literature