Genetic Diversity and Population Structure of a Camelina sativa Spring Panel.
ABSTRACT: Renewable energy sources such as biofuels are needed to help reduce reliance on fossil oils. In addition, the consumption of fossil oils adversely affects the environment and human health via the generation of waste water, greenhouse gases, and waste solids. Camelina sativa, which originated in southeastern Europe and southwestern Asia, is being re-embraced as an industrial oilseed crop due to its high seed oil content (36-47%) and high unsaturated fatty acid composition (>90%), which make it suitable for jet fuel, biodiesel, high-value lubricants and animal feed. C. sativa's agronomic advantages include short time to maturation, low water and nutrient requirements, adaptability to adverse environmental conditions and resistance to common pests and pathogens. These characteristics make it an ideal crop for sustainable agricultural systems and regions of marginal land. However, the lack of genetic and genomic resources has slowed the enhancement of this emerging oilseed crop and the exploration of its full agronomic and breeding potential. Here, a core of 213 spring C. sativa accessions was collected and genotyped. The genotypic data were used to characterize genetic diversity and population structure, in order to infer how natural selection and plant breeding may have affected formation and differentiation within natural C. sativa populations and how the genetic diversity of this species can be used in future breeding efforts. A total of 6,192 high-quality single nucleotide polymorphisms (SNPs) were identified using genotyping-by-sequencing (GBS) technology. The average polymorphism information content (PIC) value of 0.29 indicates moderate genetic diversity for the C. sativa spring panel evaluated in this report. Population structure and principal coordinates analyses (PCoA) based on SNPs revealed two distinct subpopulations. 
Subpopulation 1 (POP1) contains accessions that mainly originated from Germany, while the majority of POP2 accessions (>75%) were collected from Eastern Europe. Analysis of molecular variance (AMOVA) attributed 4% of the variance to differences among subpopulations and 96% to differences within them, indicating high gene exchange (or low genetic differentiation) between the two subpopulations. These findings provide important information for future allele/gene identification using genome-wide association studies (GWAS) and marker-assisted selection (MAS) to enhance genetic gain in C. sativa breeding programs.
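The two summary statistics quoted in the abstract can be illustrated with a short sketch. It assumes the standard PIC definition for a single locus and treats the AMOVA percentages as variance components; the abstract does not say which software or exact formula the authors used, and the function names pic and phi_st are hypothetical. Under this definition a biallelic SNP has a maximum PIC of 0.375, which gives some context for reading an average of 0.29 as moderate.

```python
def pic(allele_freqs):
    """Polymorphism information content for one locus (standard definition).

    PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    correction = 0.0
    n = len(allele_freqs)
    for i in range(n):
        for j in range(i + 1, n):
            correction += 2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
    return 1.0 - homozygosity - correction


def phi_st(var_among, var_within):
    """Fraction of total variance found among subpopulations (AMOVA-style)."""
    return var_among / (var_among + var_within)


if __name__ == "__main__":
    print(pic([0.5, 0.5]))   # biallelic maximum under this definition
    print(pic([0.8, 0.2]))   # a less balanced SNP carries less information
    print(phi_st(4, 96))     # the 4% / 96% split reported in the abstract
```

With the reported split, the among-subpopulation fraction is 0.04, which is the quantity behind the abstract's reading of low differentiation.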
Project description: Camelina sativa (L.) Crantz, an oilseed crop of the Brassicaceae family, is gaining attention due to its potential as a source of high-value oil for food, feed or fuel. The hexaploid domesticated C. sativa has limited genetic diversity, encouraging the exploration of related species for novel allelic variation for traits of interest. The current study utilized genotyping by sequencing to characterize 193 Camelina accessions belonging to seven different species collected primarily from the Ukrainian-Russian region and Eastern Europe. Population analyses among Camelina accessions with a 2n = 40 karyotype identified three subpopulations, two composed of domesticated C. sativa and one of C. microcarpa. Winter-type Camelina lines were identified as admixtures of C. sativa and C. microcarpa. Eighteen genotypes of related C. microcarpa unexpectedly shared only two subgenomes with C. sativa, suggesting a novel or cryptic sub-species of C. microcarpa with 19 haploid chromosomes. One C. microcarpa accession (2n = 26) was found to comprise the first two subgenomes of C. sativa, suggesting a tetraploid structure. The defined chromosome series among C. microcarpa germplasm, including the newly designated C. neglecta diploid née C. microcarpa, suggested an evolutionary trajectory for the formation of the C. sativa hexaploid genome and re-defined the underlying subgenome structure of the reference genome.
Project description: Sweet sorghum is an attractive feedstock for the production of renewable chemicals and fuels due to the readily available fermentable sugars that can be extracted from the juice, and the additional stream of fermentable sugars that can be obtained from the cell wall polysaccharides in the bagasse. An important selection criterion for new sweet sorghum germplasm is resistance to anthracnose, a disease caused by the fungal pathogen Colletotrichum sublineolum. The identification of novel anthracnose-resistance sources present in sweet sorghum germplasm offers a fast track towards the development of new resistant sweet sorghum germplasm. We established a sweet sorghum diversity panel (SWDP) of 272 accessions from the USDA-ARS National Plant Germplasm System (NPGS) collection that includes landraces from 22 countries and advanced breeding material, and that represents ~15% of the NPGS sweet sorghum collection. Genomic characterization of the SWDP identified 171,954 single nucleotide polymorphisms (SNPs) with an average of one SNP per 4,071 bp. Population structure analysis revealed that the SWDP could be stratified into four populations and one admixed group, and that this population structure could be aligned to sorghum's racial classification. Results from a two-year replicated trial of the SWDP for anthracnose resistance response in Texas, Georgia, Florida, and Puerto Rico showed 27 accessions to be resistant across locations, while 145 accessions showed variable resistance response against local pathotypes. A genome-wide association study identified 16 novel genomic regions associated with anthracnose resistance. Four resistance loci on chromosomes 3, 6, 8 and 9 were identified against pathotypes from Puerto Rico, and two resistance loci on chromosomes 3 and 8 against pathotypes from Texas. In Georgia, three resistance loci were detected on chromosomes 4, 5 and 6; in Florida, four were detected on chromosomes 4, 5 (two loci) and 7. 
One resistance locus on chromosome 2 was effective against pathotypes from Texas and Puerto Rico and a genomic region of 41.6 kb at the tip of chromosome 8 was associated with resistance response observed in Georgia, Texas, and Puerto Rico. This publicly available SWDP and the extensive evaluation of anthracnose resistance represent a valuable genomic resource for the improvement of sorghum.
Project description: Background: Wheat (Triticum aestivum L.) is a globally important crop with a complex genome. Understanding the genetic diversity among global wheat genotypes is essential for identifying parents with useful agronomic characteristics for breeding programs. It is also useful in breeding studies such as marker-assisted selection (MAS), genome-wide association studies (GWAS), and genomic selection. Results: To understand the genetic diversity in wheat, a set of 103 spring wheat genotypes representing five different continents was used. These genotypes were genotyped using 36,720 genotyping-by-sequencing derived SNPs (GBS-SNPs), which were well distributed across the wheat chromosomes. The 103 wheat genotypes formed three different subpopulations based on population structure, principal coordinate, and kinship analyses. Significant variation was found within and among the subpopulations based on the AMOVA. Subpopulation 1 was found to be the most diverse subpopulation based on the different allelic patterns (Na, Ne, I, h, and uh). No high linkage disequilibrium (LD) was found between the 36,720 SNPs. However, at the genome level, the D genome was found to have the highest LD compared with the A and B genomes. The ratio of the number of significant LD pairs to the number of non-significant LD pairs suggested that chromosomes 2D, 5A, and 7B have the highest LD in their respective genomes, with values of 0.08, 0.07, and 0.05, respectively. Based on LD decay, the D genome was found to be the lowest of the three genomes, with the highest number of haplotype blocks on chromosome 2D. Conclusion: The present study concluded that the 103 spring wheat genotypes and their GBS-SNP markers are very appropriate for GWAS and QTL mapping. The core collection comprises three different subpopulations. 
Genotypes in subpopulation 1 are the most diverse and could be used in future breeding programs if they have the desired traits. The distribution of LD hotspots across the genome was also investigated, which provides useful information on the genomic regions that include interesting genes.
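The LD summary above rests on a pairwise statistic between SNPs. As a minimal sketch, the code below computes r-squared as the squared Pearson correlation between two SNPs coded as allele dosages (0, 1 or 2), together with the ratio of significant to non-significant pairs quoted per chromosome. The abstract does not state which LD estimator or software was used, so the dosage-correlation estimator is an assumption, and the function names ld_r2 and significant_ratio are hypothetical.

```python
def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two SNPs coded as 0/1/2 dosages."""
    if len(geno_a) != len(geno_b) or not geno_a:
        raise ValueError("genotype vectors must be non-empty and equal in length")
    n = len(geno_a)
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


def significant_ratio(n_significant, n_nonsignificant):
    """Number of significant LD pairs divided by number of non-significant pairs."""
    return n_significant / n_nonsignificant


if __name__ == "__main__":
    # Toy genotypes for five inbred lines (illustrative values, not study data).
    snp1 = [0, 2, 2, 0, 2]
    snp2 = [0, 2, 2, 0, 0]
    print(ld_r2(snp1, snp2))
    print(significant_ratio(8, 100))
```

A ratio of 0.08, as reported for chromosome 2D, would correspond to 8 significant pairs for every 100 non-significant ones.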
Project description: Landraces are a potential source of genetic diversity and provide useful genetic resources to cope with the current and future challenges in crop breeding. Afghanistan is located close to the centre of origin of hexaploid wheat. Therefore, understanding the population structure and genetic diversity of Afghan wheat landraces is of enormous importance in breeding programmes for the development of high-yielding cultivars as well as for broadening the genetic base of bread wheat. Here, a panel of 363 bread wheat landraces collected from seven north and north-eastern provinces of Afghanistan was evaluated for population structure and genetic diversity using single nucleotide polymorphism (SNP) markers. After quality control, genotyping-by-sequencing of the landraces provided 4897 high-quality SNPs distributed across the genomes A (33.75%), B (38.73%), and D (27.50%). Population structure was analysed by two methods: model-based STRUCTURE analysis and cluster-based discriminant analysis of principal components (DAPC). The analysis of molecular variance showed a higher proportion of variation within the sub-populations than between them. STRUCTURE and DAPC analysis grouped the majority of the landraces from Badakhshan and Takhar together in one cluster and the landraces from Baghlan and Kunduz in a second cluster, which is in accordance with the micro-climatic conditions prevalent within the north-eastern agro-ecological zone. Genetic distance analysis was also performed to identify differences among the Afghan regions; the smallest genetic distance was observed between Badakhshan and Takhar (0.003), whereas Samangan and Konarha (0.399) showed the highest genetic distance. 
The population structure and genetic diversity analysis highlighted the complex genetic variation present in the landraces, which was highly correlated with the geographic origin and micro-climatic conditions within the agro-climatic zones of the landraces. The higher proportions of admixture could be attributed to historical unsupervised exchanges of seeds between the farmers of the central and north-eastern provinces of Afghanistan. The results of this study will provide useful information for genetic improvement in wheat and are essential for association mapping and genomic prediction studies to identify novel sources of resistance to abiotic and biotic stresses.
Project description: A common bean (Phaseolus vulgaris) diversity panel of 308 lines was established from local Spanish germplasm, as well as old and elite cultivars mainly used for snap consumption. Most of the included landraces were derived from the Spanish common bean core collection, so this panel can be considered representative of the Spanish diversity for this species. The panel was characterized by 3099 single-nucleotide polymorphism markers obtained through genotyping-by-sequencing, which revealed a wide genetic diversity and a low level of redundant material within the panel. Structure, cluster, and principal component analyses revealed the presence of two main subpopulations corresponding to the two main gene pools identified in common bean, the Andean and Mesoamerican pools, although most lines (70%) were associated with the Andean gene pool. Lines showing recombination between the two gene pools were also observed, most of them suitable for snap bean consumption, which suggests that both gene pools were probably used in the breeding of snap bean cultivars. The usefulness of this panel for genome-wide association studies was tested by conducting association mapping for determinacy. Significant marker-trait associations were found on chromosome Pv01, involving the gene Phvul.001G189200, which was identified as a candidate gene for determinacy in the common bean.
Project description: Assessing the genetic diversity and population structure of a core collection facilitates the use of its germplasm, including its application in association mapping. The objectives of this study were to (1) examine the population structure of a rice core collection; (2) investigate the genetic diversity within and among subgroups of the rice core collection; and (3) identify the extent of linkage disequilibrium (LD) in the rice core collection. A rice core collection of 150 varieties, established from 2260 varieties of Ting's collection of rice germplasm, was genotyped with 274 SSR markers and used in this study. Two distinct subgroups (i.e. SG 1 and SG 2) were detected within the entire population by different statistical methods, which is in accordance with the differentiation of indica and japonica rice. MCLUST analysis might be an alternative method to STRUCTURE for population structure analysis. A subset of 26% of the total markers could detect the population structure with precision similar to that of the whole SSR marker set. Gene diversity and MRD between the two subspecies varied considerably across the genome, which might be used to identify candidate genes for the traits under domestication and artificial selection of indica and japonica rice. The percentage of SSR loci pairs in significant (P<0.05) LD is 46.8% in the entire population, and the ratio of linked to unlinked loci pairs in LD is 1.06. Across the entire population as well as the subgroups and sub-subgroups, LD decays with genetic distance, indicating that linkage is one main cause of LD. The results of this study provide valuable information for future association mapping using the rice core collection.
Project description: North-eastern (NE) India, comprising Arunachal Pradesh, Assam, Manipur, Meghalaya, Mizoram, Nagaland, Sikkim and Tripura, possesses a diverse array of locally adapted non-Basmati aromatic germplasm. The germplasm collections from this region could serve as valuable resources in breeding for abiotic stress tolerance, grain yield and cooking/eating quality. To utilize such collections, however, breeders need information about the extent and distribution of genetic diversity present within collections. In this study, we report the results of a population genetic analysis of 107 aromatic and quality rice accessions collected from different parts of NE India, and classify these accessions in the context of a set of structured global rice cultivars. A total of 322 alleles were amplified by 40 simple sequence repeat (SSR) markers with an average of 8.03 alleles per locus. Average gene diversity was 0.67. Population structure analysis revealed that NE Indian aromatic rice can be subdivided into three genetically distinct population clusters: P1, joha rice accessions from Assam, tai rices from Mizoram and those from Sikkim; P2, aromatic rice accessions from Nagaland; and P3, chakhao rice germplasm from Manipur. Pair-wise FST between the three groups varied from 0.223 (P1 vs P2) to 0.453 (P2 vs P3). With reference to the global classification of rice cultivars, two major groups (Indica and Japonica) were identified in NE Indian germplasm. The aromatic accessions from Assam, Manipur and Sikkim were assigned to the Indica group, while the accessions from Nagaland exhibited close association with Japonica. The tai accessions of Mizoram, along with a few chakhao accessions collected from the hill districts of Manipur, were identified as admixed. The results highlight the importance of regional genetic studies for understanding diversification of aromatic rice in India. 
The data also suggest that there is scope for exploiting the genetic diversity of aromatic and quality rice germplasm of NE India for rice improvement.
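Gene diversity and pairwise FST, both reported above, can be sketched for a single locus as follows. The sketch assumes Nei's expected heterozygosity for gene diversity and a simple equal-weight (HT - HS) / HT form of FST for two populations; the abstract does not specify the estimators used in the study, and the function names gene_diversity and fst_two_pops are hypothetical.

```python
def gene_diversity(allele_freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def fst_two_pops(freqs_1, freqs_2):
    """Single-locus F_ST = (H_T - H_S) / H_T for two equally weighted populations.

    freqs_1 and freqs_2 list the frequencies of the same alleles, in the
    same order, in population 1 and population 2.
    """
    if len(freqs_1) != len(freqs_2):
        raise ValueError("both populations must list the same alleles")
    # Mean within-population diversity.
    h_s = (gene_diversity(freqs_1) + gene_diversity(freqs_2)) / 2
    # Diversity of the pooled population (average allele frequencies).
    pooled = [(a + b) / 2 for a, b in zip(freqs_1, freqs_2)]
    h_t = gene_diversity(pooled)
    if h_t == 0:
        raise ValueError("F_ST is undefined when the pooled locus is monomorphic")
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Illustrative allele frequencies at one SSR locus (not study data).
    pop_a = [0.7, 0.2, 0.1]
    pop_b = [0.1, 0.3, 0.6]
    print(gene_diversity(pop_a))
    print(fst_two_pops(pop_a, pop_b))
```

In practice such values are averaged over loci, and multi-locus estimators with sample-size corrections are generally preferred to this single-locus form.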
Project description: Camelina sativa is an oilseed with desirable agronomic and oil-quality attributes for a viable industrial oil platform crop. Here we generated the first chromosome-scale high-quality reference genome sequence for C. sativa and annotated 89,418 protein-coding genes, representing a whole-genome triplication event relative to the crucifer model Arabidopsis thaliana. C. sativa represents the first crop species to be sequenced from lineage I of the Brassicaceae. The well-preserved hexaploid genome structure of C. sativa surprisingly mirrors those of economically important amphidiploid Brassica crop species from lineage II as well as wheat and cotton. The three genomes of C. sativa show no evidence of fractionation bias and limited expression-level bias, both characteristics commonly associated with polyploid evolution. The highly undifferentiated polyploid genome of C. sativa presents significant consequences for breeding and genetic manipulation of this industrial oil crop.
Project description: In the last century, breeding programs have traditionally favoured yield-related traits under high-input growing conditions, resulting in a loss of genetic diversity and an increased susceptibility to stresses in crops. Thus, exploiting understudied genetic resources that potentially harbour tolerance genes is vital for sustainable agriculture. Northern European barley germplasm has been relatively understudied despite its key role within the malting industry. The European Heritage Barley collection (ExHIBiT) was assembled to explore the genetic diversity in European barley, focusing on Northern European accessions, and to further address environmental pressures. ExHIBiT consists of 363 spring-barley accessions, focusing on the two-row type. The collection consists of landraces (~14%), old cultivars (~18%), elite cultivars (~67%) and accessions with unknown breeding history (~1%), with 70% of the collection from Northern Europe. The population structure of the ExHIBiT collection was subdivided into three main clusters, primarily based on the accessions' year of release, using 26,585 informative SNPs from 50k iSelect single nucleotide polymorphism (SNP) array data. Power analysis established a representative core collection of 230 genotypically and phenotypically diverse accessions. The effectiveness of this core collection for conducting statistical and association analysis was explored by undertaking genome-wide association studies (GWAS) using 24,876 SNPs for nine phenotypic traits, four of which were associated with SNPs. Genomic regions overlapping with previously characterised flowering genes (HvZTLb) were identified, demonstrating the utility of the ExHIBiT core collection for locating genetic regions that determine important traits. Overall, the ExHIBiT core collection represents the high level of untapped diversity within Northern European barley, providing a powerful resource for researchers and breeders to address future climate scenarios.
Project description: Global environmental change and an increasing human population emphasize the urgent need for higher-yielding and better-adapted crop plants. One strategy to achieve this aim is to exploit the wealth of so-called landraces of crop species, which represent diverse traditional domesticated populations of locally adapted genotypes. In this study, we investigated a comprehensive set of 1485 spring barley landraces (Lrc1485) adapted to a wide range of climates, which were selected from one of the largest genebanks worldwide. The landraces originated from 5° to 62.5° N and 16° to 71° E. The whole collection was genotyped using 42 SSR markers to assess the genetic diversity and population structure. With an average allelic richness of 5.74 and 372 alleles, Lrc1485 harbours considerably more genetic diversity than the most polymorphic current GWAS panel for barley. Ten major clusters defined most of the population structure based on geographical origin, row type of the ear and caryopsis type, and were assigned to specific climate zones. The legacy core reference set Lrc648 established in this study will provide a long-lasting resource and a very valuable tool for the scientific community. Lrc648 is best suited for multi-environmental field testing to identify candidate genes underlying quantitative traits, but also for allele mining approaches.