Project description:Background and aims: Phylogenetic relationships within tribe Shoreeae, containing the main elements of tropical forests in Southeast Asia, present a long-standing problem in the systematics of Dipterocarpaceae. Sequencing whole plastomes using next-generation sequencing- (NGS) based genome skimming is increasingly employed for investigating phylogenetic relationships of plants. Here, the usefulness of complete plastid genome sequences in resolving phylogenetic relationships within Shoreeae is evaluated. Methods: A pipeline to obtain alignments of whole plastid genome sequences across individuals with different amounts of available data is presented. In total, 48 individuals, representing 37 species and four genera of the ecologically and economically important tribe Shoreeae sensu Ashton, were investigated. Phylogenetic trees were reconstructed using maximum parsimony, maximum likelihood and Bayesian inference. Key results: Here, the first fully sequenced plastid genomes for the tribe Shoreeae are presented. Their size, GC content and gene order are comparable with those of other members of Malvales. Phylogenomic analyses demonstrate that whole plastid genomes are useful for inferring phylogenetic relationships among genera and groups of Shorea (Shoreeae) but fail to provide well-supported phylogenetic relationships among some of the most closely related species. Discordance in placement of Parashorea was observed between phylogenetic trees obtained from plastome analyses and those obtained from nuclear single nucleotide polymorphism (SNP) data sets identified in restriction-site associated sequencing (RADseq). Conclusions: Phylogenomic analyses of the entire plastid genomes are useful for inferring phylogenetic relationships at lower taxonomic levels, but are not sufficient for detailed phylogenetic reconstructions of closely related species groups in Shoreeae.
Discordance in placement of Parashorea was further investigated for evidence of ancient hybridization.
Project description:Isolated populations have unique population genetics characteristics that can help boost power in genetic association studies for complex traits. Leveraging these advantageous characteristics requires an in-depth understanding of parameters that have shaped sequence variation in isolates. This study performs a comprehensive investigation of these parameters using low-depth whole genome sequencing (WGS) across multiple isolates.
Project description:Background: Tribe Cinnamomeae is a species-rich and ecologically important group in tropical and subtropical forests. Previous studies explored its phylogenetic relationships and historical biogeography using limited loci, which might result in biased molecular dating due to insufficient parsimony-informative sites. Thus, 15 plastomes were newly sequenced and combined with published plastomes to study plastome structural variations, gene evolution, phylogenetic relationships, and divergence times of this tribe. Results: Among the 15 newly generated plastomes, 14 ranged from 152,551 bp to 152,847 bp, and the remaining one (Cinnamomum chartophyllum XTBGLQM0164) was 158,657 bp. The inverted repeat (IR) regions of XTBGLQM0164 contained complete ycf2, trnI-CAU, rpl32, and rpl2. Four hypervariable plastid loci (ycf1, ycf2, ndhF-rpl32-trnL-UAG, and petA-psbJ) were identified as candidate DNA barcodes. Divergence times based on a few loci were primarily determined by prior age constraints rather than by DNA data. In contrast, molecular dating using complete plastid protein-coding genes (PCGs) was determined by DNA data rather than by prior age constraints. Dating analyses using PCGs showed that Cinnamomum sect. Camphora diverged from C. sect. Cinnamomum in the late Oligocene (27.47 Ma). Conclusions: This study reports the first case of drastic IR expansion in tribe Cinnamomeae, and indicates that plastomes have sufficient parsimony-informative sites for molecular dating. Besides, the dating analyses provide preliminary insights into the divergence time within tribe Cinnamomeae and can facilitate future studies on its historical biogeography.
Project description:Motivation: Very low-depth sequencing has been proposed as a cost-effective approach to capture low-frequency and rare variation in complex trait association studies. However, a full characterization of the genotype quality and association power for very low-depth sequencing designs is still lacking. Results: We perform cohort-wide whole-genome sequencing (WGS) at low depth in 1239 individuals (990 at 1× depth and 249 at 4× depth) from an isolated population, and establish a robust pipeline for calling and imputing very low-depth WGS genotypes from standard bioinformatics tools. Using genotyping chip, whole-exome sequencing (75× depth) and high-depth (22×) WGS data in the same samples, we examine in detail the sensitivity of this approach, and show that imputed 1× WGS recapitulates 95.2% of variants found by imputed GWAS with an average minor allele concordance of 97% for common and low-frequency variants. In our study, 1× further allowed the discovery of 140 844 true low-frequency variants with 73% genotype concordance when compared to high-depth WGS data. Finally, using association results for 57 quantitative traits, we show that very low-depth WGS is an efficient alternative to imputed GWAS chip designs, allowing the discovery of up to twice as many true association signals as the classical imputed GWAS design. Availability and implementation: The HELIC genotype and WGS datasets have been deposited to the European Genome-phenome Archive (https://www.ebi.ac.uk/ega/home): EGAD00010000518; EGAD00010000522; EGAD00010000610; EGAD00001001636, EGAD00001001637. The peakplotter software is available at https://github.com/wtsi-team144/peakplotter, the transformPhenotype app can be downloaded at https://github.com/wtsi-team144/transformPhenotype. Supplementary information: Supplementary data are available at Bioinformatics online.