Dataset Information

Generating High Density, Low Cost Genotype Data in Soybean [Glycine max (L.) Merr.].

ABSTRACT: Obtaining genome-wide genotype information for millions of SNPs in soybean [Glycine max (L.) Merr.] often involves completely resequencing a line at 5X or greater coverage. Currently, hundreds of soybean lines have been resequenced at high depth levels with their data deposited in the NCBI Short Read Archive. This publicly available dataset may be leveraged as an imputation reference panel in combination with skim (low coverage) sequencing of new soybean genotypes to economically obtain high-density SNP information. Ninety-nine soybean lines resequenced at an average of 17.1X were used to generate a reference panel, with over 10 million SNPs called using GATK's Haplotype Caller tool. Whole genome resequencing at approximately 1X depth was performed on 114 previously ungenotyped experimental soybean lines. Coverages down to 0.1X were analyzed by randomly subsetting raw reads from the original 1X sequence data. SNPs discovered in the reference panel were genotyped in the experimental lines after aligning to the soybean reference genome, and missing markers imputed using Beagle 4.1. Sequencing depth of the experimental lines could be reduced to 0.3X while still retaining an accuracy of 97.8%. Accuracy was inversely related to minor allele frequency, and highly correlated with marker linkage disequilibrium. The high accuracy of skim sequencing combined with imputation provides a low cost method for obtaining dense genotypic information that can be used for various genomics applications in soybean.
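The abstract describes two steps that a short sketch can make concrete: simulating lower coverage by randomly subsetting reads, and scoring imputation by comparing imputed genotypes with known ones. The Python below is only an illustrative sketch under stated assumptions, not the authors' pipeline. The function names `subsample_reads` and `concordance` are hypothetical, genotypes are assumed to be coded as alternate-allele counts (0, 1, 2), and accuracy is assumed to mean simple per-site concordance, which this record does not confirm.

```python
import random


def subsample_reads(reads, source_depth, target_depth, seed=0):
    """Randomly keep reads so expected coverage drops from source_depth to target_depth.

    Assumption: each read is kept independently with probability
    target_depth / source_depth (e.g. 0.3 to go from 1X to 0.3X).
    """
    if not 0 < target_depth <= source_depth:
        raise ValueError("target_depth must be in (0, source_depth]")
    keep_fraction = target_depth / source_depth
    rng = random.Random(seed)  # seeded so the subset is reproducible
    return [read for read in reads if rng.random() < keep_fraction]


def concordance(imputed, truth):
    """Fraction of sites where the imputed genotype equals the known genotype.

    Assumption: genotypes are alternate-allele counts (0, 1, 2); sites whose
    true genotype is unknown (None) are excluded from the comparison.
    """
    pairs = [(i, t) for i, t in zip(imputed, truth) if t is not None]
    if not pairs:
        raise ValueError("no sites with a known genotype to compare")
    return sum(i == t for i, t in pairs) / len(pairs)


if __name__ == "__main__":
    reads = [f"read_{n}" for n in range(10_000)]  # stand-ins for 1X of raw reads
    skim = subsample_reads(reads, source_depth=1.0, target_depth=0.3, seed=1)
    print(len(skim) / len(reads))  # close to 0.3, varies with the seed

    print(concordance([0, 1, 2, 2], [0, 1, 2, 0]))  # 3 of 4 sites match: 0.75
```

In practice the subsetting would be applied to FASTQ or BAM records rather than an in-memory list, and the comparison would be run per marker so that accuracy can be related to minor allele frequency and linkage disequilibrium, as the abstract reports.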

SUBMITTER: Happ MM

PROVIDER: S-EPMC6643887 | biostudies-literature | 2019 Jul

REPOSITORIES: biostudies-literature

Publications

Generating High Density, Low Cost Genotype Data in Soybean [Glycine max (L.) Merr.].

Mary M. Happ, Haichuan Wang, George L. Graef, David L. Hyten

G3 (Bethesda, Md.), 2019 Jul 9; issue 7

PMID: 31072870

Similar Datasets

Project description: Soybean (Glycine max) seed yields rely on the efficiency of photosynthesis, which is poorly understood in soybean. Chlorophyll, the major light-harvesting pigment, is crucial for chloroplast biogenesis and photosynthesis. Magnesium chelatase catalyzes the insertion of Mg2+ into protoporphyrin IX in the first committed and key regulatory step of chlorophyll biosynthesis. It consists of three types of subunits: ChlI, ChlD, and ChlH. To gain a better understanding of chlorophyll biosynthesis in soybean, we analyzed soybean Mg-chelatase subunits and their encoding genes. The soybean genome harbors 4 GmChlI genes, 2 GmChlD genes, and 3 GmChlH genes, which likely evolved through two rounds of gene duplication. qRT-PCR analysis revealed that the GmChlI, GmChlD, and GmChlH genes are predominantly expressed in photosynthetic tissues, but expression levels differ among paralogs. In silico promoter analyses revealed that these genes harbor different cis-regulatory elements in their promoter regions, suggesting that they could respond differentially to various environmental and developmental signals. Subcellular localization analyses showed that the GmChlI, GmChlD, and GmChlH isoforms are all localized in the chloroplast, consistent with their functions. Yeast two-hybrid and bimolecular fluorescence complementation (BiFC) assays showed that each isoform has the potential to be assembled into the Mg-chelatase holocomplex. We expressed each GmChlI, GmChlD, and GmChlH isoform in the corresponding Arabidopsis mutants. The 4 GmChlI isoforms, the 2 GmChlD isoforms, and GmChlH1 rescued the severe phenotype of the Arabidopsis mutants, indicating that they maintain normal biochemical functions in vivo. However, GmChlH2 and GmChlH3 could not completely rescue the chlorotic phenotype of the Arabidopsis gun5-2 mutant, suggesting that the functions of these two proteins could differ from that of GmChlH1. Considering the differences in primary sequences, biochemical functions, and gene expression profiles, we conclude that the paralogs of each soybean Mg-chelatase subunit have diverged to some extent during evolution. Soybean could have developed a complex regulatory mechanism to control chlorophyll content and adapt to different developmental and environmental conditions.
