Dataset Information

Selecting RAD-Seq Data Analysis Parameters for Population Genetics: The More the Better?

ABSTRACT: Restriction site-associated DNA sequencing (RAD-seq) has become a powerful and widely used tool in molecular ecology studies because it allows the cost-effective recovery of thousands of polymorphic sites across individuals of non-model organisms. However, its successful implementation in population genetics relies on correct data processing that minimizes potential loci-assembly biases and the resulting genotyping error rates. When no reference genome is available, RAD-seq data processing involves the assembly of hundreds of thousands of high-throughput sequencing reads into orthologous loci, for which the researcher must select values for various key parameters. Previous studies exploring the effect of these parameter values found or assumed that a larger number of recovered polymorphic loci is associated with a better assembly. Here, using three RAD-seq datasets from different species, we explore the effect of read filtering, loci assembly and polymorphic site selection on the number of markers obtained and the genetic differentiation inferred using the Stacks software. We find (i) that recovery of higher numbers of polymorphic loci is not necessarily associated with higher genetic differentiation, (ii) that the presence of PCR duplicates, the selected loci assembly parameters and the selected SNP filtering parameters affect the number of recovered polymorphic loci and the degree of genetic differentiation, and (iii) that this effect differs in each dataset, meaning that defining a systematic universal protocol for RAD-seq data analysis may lead to missing relevant information about population differentiation.

SUBMITTER: Diaz-Arce N

PROVIDER: S-EPMC6549478 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature


Publications

Selecting RAD-Seq Data Analysis Parameters for Population Genetics: The More the Better?

Natalia Díaz-Arce, Naiara Rodríguez-Ezpeleta

Frontiers in Genetics, 2019-05-29


PMID: 31191624

Similar Datasets

RADIS: analysis of RAD-seq data for interspecific phylogeny.
| S-EPMC5039923 | biostudies-literature

Population genetics analysis of the Nujiang catfish Creteuchiloglanis macropterus through a genome-wide single nucleotide polymorphisms resource generated by RAD-seq.
| S-EPMC5460224 | biostudies-literature

DiscoSnp-RAD: de novo detection of small variants for RAD-Seq population genomics.
| S-EPMC7293188 | biostudies-literature

Inferring population genetics parameters of evolving viruses using time-series data.
| S-EPMC6555871 | biostudies-literature

PMERGE: Computational filtering of paralogous sequences from RAD-seq data.
| S-EPMC6065343 | biostudies-literature

Population-specific genetic variation in large sequencing data sets: why more data is still better.
| S-EPMC5602011 | biostudies-literature

Genet assignment and population structure analysis in a clonal forest-floor herb, Cardamine leucantha, using RAD-seq.
| S-EPMC6983914 | biostudies-literature

ddSeeker: a tool for processing Bio-Rad ddSEQ single cell RNA-seq data.
| S-EPMC6304778 | biostudies-other

TagDigger: user-friendly extraction of read counts from GBS and RAD-seq data.
| S-EPMC4940913 | biostudies-other

Genomic sequence diversity and population structure of Saccharomyces cerevisiae assessed by RAD-seq.
| S-EPMC3852379 | biostudies-literature