The number of alleles at a microsatellite defines the allele frequency spectrum and facilitates fast accurate estimation of theta.
ABSTRACT: Theoretical work focused on microsatellite variation has produced a number of important results, including the expected distribution of repeat sizes and the expected squared difference in repeat size between two randomly selected samples. However, closed-form expressions for the sampling distribution and frequency spectrum of microsatellite variation have not been identified. Here, we use coalescent simulations of the stepwise mutation model to develop gamma and exponential approximations of the microsatellite allele frequency spectrum, a distribution central to the description of microsatellite variation across the genome. For both approximations, the parameter of biological relevance is the number of alleles at a locus, which we express as a function of θ, the population-scaled mutation rate, based on simulated data. Discovered relationships between θ, the number of alleles, and the frequency spectrum support the development of three new estimators of microsatellite θ. The three estimators exhibit roughly similar mean squared errors (MSEs) and all are biased. However, across a broad range of sample sizes and θ values, the MSEs of these estimators are frequently lower than all other estimators tested. The new estimators are also reasonably robust to mutation that includes step sizes greater than one. Finally, our approximation to the microsatellite allele frequency spectrum provides a null distribution of microsatellite variation. In this context, a preliminary analysis of the effects of demographic change on the frequency spectrum is performed. We suggest that simulations of the microsatellite frequency spectrum under evolutionary scenarios of interest may guide investigators to the use of relevant and sometimes novel summary statistics.
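The abstract's coalescent simulations of the stepwise mutation model can be illustrated with a minimal sketch. This is not the authors' code: the function names `simulate_smm_sample` and `count_alleles` are hypothetical, mutation is restricted to single steps of +1 or -1 with equal probability, and θ is assumed to follow the common 4Nμ convention, so that each lineage mutates at rate θ/2 in coalescent time units. The gamma and exponential approximations and the three new estimators are not reproduced here.

```python
import random


def simulate_smm_sample(n, theta, rng):
    """Simulate n alleles at one locus; return repeat-size offsets from the MRCA.

    Assumes a constant-size population, a strict single-step mutation model,
    and theta = 4*N*mu (an assumption of this sketch, not taken from the record).
    """
    offsets = [0] * n
    # Each active lineage is stored as the list of sampled alleles descending from it.
    lineages = [[i] for i in range(n)]
    while len(lineages) > 1:
        k = len(lineages)
        # Looking back in time with k lineages, coalescence occurs at rate
        # k(k-1)/2 and mutation at rate k*theta/2, so the next event is a
        # coalescence with probability (k - 1) / (k - 1 + theta).
        if rng.random() < (k - 1) / (k - 1 + theta):
            a, b = rng.sample(range(k), 2)
            merged = lineages[a] + lineages[b]
            lineages = [lin for idx, lin in enumerate(lineages) if idx not in (a, b)]
            lineages.append(merged)
        else:
            # A mutation on a lineage shifts every sampled allele below it by one repeat.
            step = rng.choice((-1, 1))
            for leaf in rng.choice(lineages):
                offsets[leaf] += step
    return offsets


def count_alleles(offsets):
    """Number of distinct repeat sizes (alleles) in the sample."""
    return len(set(offsets))


if __name__ == "__main__":
    rng = random.Random(42)
    sample = simulate_smm_sample(n=50, theta=5.0, rng=rng)
    print("number of alleles:", count_alleles(sample))
```

Repeating the simulation over many replicates and a grid of θ values would give an empirical relationship between θ and the number of alleles, which is the kind of relationship the abstract describes fitting from simulated data.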
SUBMITTER: Haasl RJ
PROVIDER: S-EPMC3299306 | biostudies-literature | 2010 Dec
REPOSITORIES: biostudies-literature