Dataset Information

Inevitability and containment of replication errors for eukaryotic genome lengths spanning megabase to gigabase.


ABSTRACT: The replication of DNA is initiated at particular sites on the genome called replication origins (ROs). Understanding the constraints that regulate the distribution of ROs across different organisms is fundamental for quantifying the degree of replication errors and their downstream consequences. Using a simple probabilistic model, we generate a set of predictions on the extreme sensitivity of error rates to the distribution of ROs, and how this distribution must therefore be tuned for genomes of vastly different sizes. As genome size changes from megabases to gigabases, we predict that regularity of RO spacing is lost, that large gaps between ROs dominate error rates but are heavily constrained by the mean stalling distance of replication forks, and that, for genomes spanning ~100 megabases to ~10 gigabases, errors become increasingly inevitable but their number remains very small (three or less). Our theory predicts that the number of errors becomes significantly higher for genome sizes greater than ~10 gigabases. We test these predictions against datasets in yeast, Arabidopsis, Drosophila, and human, and also through direct experimentation on two different human cell lines. Agreement of theoretical predictions with experiment and datasets is found in all cases, resulting in a picture of great simplicity, whereby the density and positioning of ROs explain the replication error rates for the entire range of eukaryotes for which data are available. The theory highlights three domains of error rates: negligible (yeast), tolerable (metazoan), and high (some plants), with the human genome at the extreme end of the middle domain.

SUBMITTER: Al Mamun M 

PROVIDER: S-EPMC5047159 | biostudies-literature | 2016 Sep

REPOSITORIES: biostudies-literature

Publications

Inevitability and containment of replication errors for eukaryotic genome lengths spanning megabase to gigabase.

Al Mamun M, Albergante L, Moreno A, Carrington JT, Blow JJ, Newman TJ

Proceedings of the National Academy of Sciences of the United States of America, 2016-09-14, issue 39


Similar Datasets

| S-EPMC7000699 | biostudies-literature
| S-EPMC2962615 | biostudies-literature
| S-EPMC5167160 | biostudies-literature
| S-EPMC3628276 | biostudies-literature
| S-EPMC2955150 | biostudies-literature
2013-02-28 | E-GEOD-40696 | biostudies-arrayexpress
| S-EPMC8286344 | biostudies-literature
| S-EPMC5368702 | biostudies-literature
2013-02-28 | GSE40696 | GEO
| S-EPMC5631039 | biostudies-literature