Genomics

Dataset Information

Description of nucleotide and structural variation in short-season soya bean


ABSTRACT: We sequenced a representative set of 102 short-season soya beans and achieved extensive coverage of both nucleotide diversity and structural variation (SV). We called close to 5M sequence variants (SNPs, MNPs and indels) and found that the number of unique haplotypes had plateaued within this set of germplasm (1.7M tag SNPs). The data set proved highly accurate (98.6%) based on a comparison of called genotypes at loci shared with a SNP array. We used this catalogue of SNPs as a reference panel to impute missing genotypes at untyped loci in data sets derived from lower-density genotyping tools (150 K GBS-derived SNPs/530 samples). Of the missing genotypes imputed in this fashion, 96.4% proved to be accurate. Using a combination of three bioinformatics pipelines, we uncovered ~92 K SVs (deletions, insertions, inversions, duplications, CNVs and translocations) and estimated that over 90% of these were accurate.
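The accuracy figures in the abstract are genotype concordance rates: the share of genotypes at shared loci that agree between two call sets. The sketch below shows one way such a rate could be computed. It is an illustration only, not the submitters' pipeline; the function name `genotype_concordance`, the dictionary layout and the toy genotypes are all assumptions made for this example.

```python
def genotype_concordance(calls, truth):
    """Return the fraction of shared, non-missing genotypes that agree.

    Both arguments map a (sample, locus) key to a tuple of alleles,
    or to None when the genotype is missing. This layout is a
    hypothetical one chosen for the example.
    """
    shared = [
        key for key in calls
        if key in truth and calls[key] is not None and truth[key] is not None
    ]
    if not shared:
        raise ValueError("no shared, non-missing genotypes to compare")
    # Genotypes are treated as unphased, so allele order is ignored.
    matches = sum(1 for key in shared if sorted(calls[key]) == sorted(truth[key]))
    return matches / len(shared)


# Toy data (invented): sequencing calls versus SNP-array genotypes.
sequencing = {
    ("s1", "Gm01:100"): ("A", "A"),
    ("s1", "Gm01:200"): ("A", "G"),
    ("s2", "Gm01:100"): ("G", "G"),
    ("s2", "Gm01:200"): None,        # missing call, excluded
}
snp_array = {
    ("s1", "Gm01:100"): ("A", "A"),
    ("s1", "Gm01:200"): ("G", "A"),  # same genotype, different allele order
    ("s2", "Gm01:100"): ("A", "G"),  # disagreement
    ("s2", "Gm01:300"): ("T", "T"),  # locus not in the sequencing calls
}

rate = genotype_concordance(sequencing, snp_array)
print(f"concordance: {rate:.3f}")
```

The same function could be applied to imputed genotypes by passing the imputed calls as the first argument and genotypes that were masked before imputation as the second, assuming such a masked truth set is available.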

INSTRUMENT(S): Illumina HiSeq 2500

ORGANISM(S): Glycine max

SUBMITTER: USDA-ARS, CICGRU 

PROVIDER: PRJEB75222 | EVA | 2024-04-25

REPOSITORIES: EVA

Dataset's files

SNPdata2mod_modified.accessioned.vcf.gz Vcf
SNPdata2mod_modified.accessioned.vcf.gz.csi Vcf
SNPdata2mod_modified.vcf.csi Other
SNPdata2mod_modified.vcf.gz Vcf
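
As a rough illustration of how the variant classes named in the abstract (SNPs, MNPs and indels) could be tallied from one of the VCF files above, here is a minimal sketch using only the Python standard library. It assumes the file is gzip- or bgzip-compressed text in standard VCF layout; the helper names and the simplified classification rule (equal-length alleles longer than one base count as MNPs) are assumptions for this example, not the submitters' definitions.

```python
import gzip
from collections import Counter


def classify_allele(ref, alt):
    """Classify one REF/ALT pair with a simplified, length-based rule."""
    # Symbolic alleles, breakends and placeholders are not sequence variants.
    if alt.startswith("<") or alt in {"*", "."} or "[" in alt or "]" in alt:
        return "other"
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    return "indel"


def count_variant_types(lines):
    """Count variant classes over an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):  # one count per alternate allele
            counts[classify_allele(ref, alt)] += 1
    return counts


def count_variant_types_in_file(path):
    """Open a gzip-compressed VCF as text and count its variant classes."""
    with gzip.open(path, "rt") as handle:
        return count_variant_types(handle)
```

After downloading the file, a call such as `count_variant_types_in_file("SNPdata2mod_modified.vcf.gz")` would return a `Counter` keyed by class. Multi-allelic records contribute one count per alternate allele, so totals may differ from a per-record count.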

Similar Datasets

2024-08-02 | PXD048367 | Pride
2023-02-07 | GSE170763 | GEO
2016-07-01 | GSE82042 | GEO
2013-06-21 | E-GEOD-48152 | biostudies-arrayexpress
| EGAD00010001640 | EGA
2021-06-30 | GSE167517 | GEO
2016-07-01 | E-GEOD-82042 | biostudies-arrayexpress
2013-06-21 | GSE48152 | GEO
2016-12-30 | GSE83710 | GEO
2016-12-30 | GSE83709 | GEO