Optimizing depth and type of high-throughput sequencing data for microsatellite discovery.
ABSTRACT: Premise: Simple sequence repeat (SSR) markers (microsatellites) are a mainstay of many labs, especially when working on a limited budget, carrying out preliminary analyses, and in teaching. Whether SSRs mined from plant genomes or transcriptomes are preferred for certain applications, and the depth of sequencing needed to allow efficient SSR discovery, has not been tested. Methods: I used genome and transcriptome high-throughput sequencing data at a range of sequencing depths to compare efficacy of SSR identification. I then tested primers from tomato for amplification, polymorphism, and transferability to related species. Results: Small assemblies (two million read pairs) identified ca. 200-2000 potential markers from the genome assemblies and ca. 600-3650 from the transcriptome assemblies. Genome-derived contigs were often short, potentially precluding primer design. Genomic SSR primers were less transferable across species but exhibited greater variation (partially explained by being composed of more repeat units) than transcriptome-derived primers. Discussion: Small high-throughput sequencing resources may be sufficient for identification of hundreds of SSRs. Genomic data may be preferable in species with low polymorphism, but transcriptome data may result in longer loci (more amenable to primer design) and primers may be more transferable to related species.
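ILLUSTRATIVE SKETCH: The abstract does not say which software or thresholds were used for SSR identification, so the following is only a minimal Python sketch of the general idea: scan assembled contigs for perfect di-, tri-, and tetranucleotide repeats and flag loci whose flanking sequence may be too short for primer design. The function names (find_ssrs, has_flanks), the minimum repeat counts, and the 50 bp flank requirement are all assumptions made for illustration, not values from the study.

```python
import re

# Assumed thresholds (not from the paper): motif length -> minimum repeat units.
MIN_REPEATS = {2: 6, 3: 5, 4: 5}
# Assumed minimum flanking sequence (bp) on each side for primer design.
MIN_FLANK = 50


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return perfect SSRs in a contig as dicts with motif, start, end, repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in min_repeats.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT");
            # "ATAT" tracts are reported once, as the dinucleotide "AT".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append(
                {
                    "motif": motif,
                    "start": m.start(),
                    "end": m.end(),
                    "repeats": (m.end() - m.start()) // motif_len,
                }
            )
    return hits


def has_flanks(seq, hit, min_flank=MIN_FLANK):
    """True if the contig leaves at least min_flank bp on both sides of the SSR."""
    return hit["start"] >= min_flank and len(seq) - hit["end"] >= min_flank


if __name__ == "__main__":
    flank = "ACGTTGCA" * 7  # 56 bp of toy flanking sequence
    contig = flank + "AG" * 8 + flank
    for hit in find_ssrs(contig):
        print(hit, "primer-designable:", has_flanks(contig, hit))
```

A check like has_flanks reflects the abstract's observation that short genome-derived contigs can preclude primer design: an SSR near the end of a short contig is detected but cannot be flanked by primers.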
SUBMITTER: Chapman MA
PROVIDER: S-EPMC6858294 | biostudies-literature | 2019 Nov
REPOSITORIES: biostudies-literature