Project description:Whole-genome sequencing (WGS) with next-generation DNA sequencing (NGS) is an increasingly accessible and affordable method for genotyping hundreds of Mycobacterium tuberculosis (Mtb) isolates, leading to more effective epidemiological studies involving single nucleotide variations (SNVs) in core genomic sequences based on molecular evolution. We developed an all-in-one web-based tool for genotyping Mtb, referred to as the Total Genotyping Solution for TB (TGS-TB), to facilitate multiple genotyping platforms using NGS for spoligotyping and the detection of phylogenies with core genomic SNVs, IS6110 insertion sites, and 43 customized loci for variable number tandem repeat (VNTR) through a user-friendly, simple click interface. This methodology is implemented with a KvarQ script to predict MTBC lineages/sublineages and potential antimicrobial resistance. Seven Mtb isolates (JP01 to JP07) in this study showing the same VNTR profile were accurately discriminated through median-joining network analysis using SNVs unique to those isolates. An additional IS6110 insertion was detected in one of those isolates as supportive genetic information in addition to core genomic SNVs. The results of in silico analyses using TGS-TB are consistent with those obtained using conventional molecular genotyping methods, suggesting that NGS short reads could provide multiple genotypes to discriminate multiple strains of Mtb, although longer NGS reads (≥ 300-mer) will be required for full genotyping on the TGS-TB web site. Most available short reads (~100-mer) can be utilized to discriminate the isolates based on the core genome phylogeny. TGS-TB provides a more accurate and discriminative strain typing for clinical and epidemiological investigations; NGS strain typing offers a total genotyping solution for Mtb outbreak and surveillance. TGS-TB web site:
Project description:A long-standing challenge in human microbiome research is achieving the taxonomic and functional resolution needed to generate testable hypotheses about the gut microbiota's impact on health and disease. With a growing number of live microbial interventions in clinical development, this challenge is renewed by a need to understand the pharmacokinetics and pharmacodynamics of therapeutic candidates. While short-read sequencing of the bacterial 16S rRNA gene has been the standard for microbiota profiling, recent improvements in the fidelity of long-read sequencing underscore the need to re-evaluate the value of distinct microbiome-sequencing approaches. We leveraged samples from participants enrolled in a phase 1b clinical trial of a novel live biotherapeutic product to perform a comparative analysis of short-read and long-read amplicon and metagenomic sequencing approaches to assess their utility for generating clinical microbiome data. Across all methods, overall community taxonomic profiles were comparable and relationships between samples were conserved. Comparison of ubiquitous short-read 16S rRNA amplicon profiling to long-read profiling of the 16S-ITS-23S rRNA amplicon showed that only the latter provided strain-level community resolution and insight into novel taxa. All methods identified an active ingredient strain in treated study participants, though detection confidence was higher for long-read methods. Read coverage from both metagenomic methods provided evidence of active-ingredient strain replication in some treated participants. Compared to short-read metagenomics, approximately twice the proportion of long reads were assigned functional annotations. Finally, compositionally similar bacterial metagenome-assembled genomes (MAGs) were recovered from short-read and long-read metagenomic methods, although long reads yielded a greater number of MAGs, and more complete ones. 
Despite higher costs, both amplicon and metagenomic long-read approaches added value to the microbiome data in the form of higher-confidence taxonomic and functional resolution and improved recovery of microbial genomes compared to traditional short-read methodologies.
Project description:Animal tuberculosis is a significant infectious disease affecting both livestock and wildlife populations worldwide. Effective disease surveillance and characterization of Mycobacterium bovis (M. bovis) strains are essential for understanding transmission dynamics and implementing control measures. To date, sequencing of genomic information has relied on culture-based methods, which are time-consuming and resource-demanding and raise biosafety concerns. This study explores the use of culture-independent long-read whole-genome sequencing (WGS) for a better understanding of M. bovis epidemiology in African buffaloes (Syncerus caffer). We compared two sequencing approaches: Illumina WGS performed on culture extracts and culture-independent Oxford Nanopore adaptive sampling (NAS). Our objective was to assess the potential of NAS to detect genomic variants without sample culture. In addition, culture-independent amplicon sequencing, targeting mycobacterial-specific housekeeping and full-length 16S rRNA genes, was applied to investigate the presence of microorganisms, including nontuberculous mycobacteria. The sequencing quality obtained from DNA extracted directly from tissues using NAS is comparable to the sequencing quality of reads generated from culture-derived DNA using both NAS and Illumina technologies. We present a new approach that provides complete and accurate genome sequence reconstruction, independently of culture, using an economically affordable technique.
Project description:Next-generation DNA sequencing has revolutionized the study of biology. However, the short read lengths of the dominant instruments complicate assembly of complex genomes and haplotype phasing of mixtures of similar sequences. Here we demonstrate a method to reconstruct the sequences of individual nucleic acid molecules up to 11.6 kilobases in length from short (150-bp) reads. We show that our method can construct 99.97%-accurate synthetic reads from bacterial, plant, and animal genomic samples, full-length mRNA sequences from human cancer cell lines, and individual HIV env gene variants from a mixture. The preparation of multiple samples can be multiplexed into a single tube, further reducing effort and cost relative to competing approaches. Our approach generates sequencing libraries in three days from less than one microgram of DNA in a single-tube format without custom equipment or specialized expertise.
Project description:Cambodia has one of the highest tuberculosis (TB) incidence rates in the WHO Western Pacific region. Remarkably though, the prevalence of multidrug-resistant TB (MDR-TB) remains low. We explored the genetic diversity of Mycobacterium tuberculosis (MTB) circulating in this unique setting using whole-genome sequencing (WGS). From October 2017 until January 2018, we collected one hundred sputum specimens from consenting adults older than 21 years of age who were newly diagnosed with bacteriologically confirmed TB in 3 districts of Phnom Penh and Takeo provinces of Cambodia, before they commenced TB treatment; eighty MTB isolates were successfully cultured and sequenced. The majority of the isolates belonged to Lineage 1 (Indo-Oceanic) (69/80, 86.25%), followed by Lineage 2 (East Asian) (10/80, 12.5%) and Lineage 4 (Euro-American) (1/80, 1.25%). Phenotypic resistance to both streptomycin and isoniazid was found in 3 isolates (3/80, 3.75%), while mono-resistance to streptomycin and mono-resistance to isoniazid were each found in 2.5% of isolates (N = 2 each). None of the isolates tested was resistant to either rifampicin or ethambutol. The specificities of genotypic prediction for resistance to all drugs tested were 100%, while the sensitivities of genotypic resistance predictions to isoniazid and streptomycin were lower, at 40% (2/5) and 80% (4/5), respectively. We identified 8 clusters, each comprising two to five individuals, all residing in Takeo province; together these made up half (28/56, 50%) of all individuals sampled in the province, indicating the presence of multiple ongoing transmission events. All clustered isolates were of Lineage 1 and none were resistant to any of the drugs tested. 
This study, while demonstrating the relevance and utility of WGS in predicting drug resistance and inferring disease transmission, highlights the need to increase the representation of genotype-phenotype TB data from low- and middle-income countries in Asia and Africa to improve the accuracy of drug resistance prediction.
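The sensitivity and specificity figures quoted in the Cambodian study can be reproduced from the counts given in the abstract. The following is a minimal sketch, not the authors' analysis code. It assumes that all 80 sequenced isolates were phenotypically tested for both drugs, that the 5 resistant isolates per drug are the 3 dual-resistant plus 2 mono-resistant isolates, and that the reported 100% specificity corresponds to zero false-positive predictions. The class and variable names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Genotypic prediction vs. phenotypic result (reference) for one drug."""
    true_pos: int   # phenotypically resistant, predicted resistant
    false_neg: int  # phenotypically resistant, predicted susceptible
    true_neg: int   # phenotypically susceptible, predicted susceptible
    false_pos: int  # phenotypically susceptible, predicted resistant

    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)


# Counts taken from the abstract; the total tested is an assumption.
TOTAL_ISOLATES = 80   # assumed: every sequenced isolate was tested
DUAL_RESISTANT = 3    # resistant to both streptomycin and isoniazid
MONO_RESISTANT = 2    # mono-resistant isolates, per drug

resistant = DUAL_RESISTANT + MONO_RESISTANT   # 5 per drug
susceptible = TOTAL_ISOLATES - resistant      # 75 per drug

# True positives (2/5 and 4/5) come from the reported sensitivities;
# zero false positives is inferred from the reported 100% specificity.
counts = {
    "isoniazid": ConfusionCounts(2, resistant - 2, susceptible, 0),
    "streptomycin": ConfusionCounts(4, resistant - 4, susceptible, 0),
}

for drug, c in counts.items():
    print(f"{drug}: sensitivity {c.sensitivity():.0%}, "
          f"specificity {c.specificity():.0%}")
```

With only 5 phenotypically resistant isolates per drug, each missed prediction shifts sensitivity by 20 percentage points, which is one reason the abstract calls for more genotype-phenotype data.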
Project description:Background: Streptococcus suis is divided into 29 serotypes based on a serological reaction against the capsular polysaccharide (CPS). Multiplex PCR tests targeting the cps locus are also used to determine S. suis serotypes, but they cannot differentiate between serotypes 1 and 14, and between serotypes 2 and 1/2. Here, we developed a pipeline permitting in silico serotype determination from whole-genome sequencing (WGS) short-read data that can readily identify all 29 S. suis serotypes. Results: We sequenced the genomes of 121 strains representing all 29 known S. suis serotypes. We next combined available software into an automated pipeline permitting in silico serotyping of strains by differential alignment of short-read sequencing data to a custom S. suis cps loci database. Strains of serotype pairs 1 and 14, and 2 and 1/2 could be differentiated by a missense mutation in the cpsK gene. We report a 99 % match between coagglutination- and pipeline-determined serotypes for strains in our collection. We used 375 additional S. suis genomes downloaded from the NCBI's Sequence Read Archive (SRA) to validate the pipeline. Validation with SRA WGS data resulted in a 92 % match. Included pipeline subroutines permitted us to assess strain virulence marker content and obtain multilocus sequence typing directly from WGS data. Conclusions: Our pipeline permits rapid and accurate determination of S. suis serotype, and other lineage information, directly from WGS data. By discriminating between serotypes 1 and 14, and between serotypes 2 and 1/2, our approach solves a three-decade longstanding S. suis typing issue.
Project description:Application of Oxford Nanopore Technologies' long-read sequencing platform to transcriptomic analysis is increasing in popularity. However, such analysis can be challenging due to the high sequence error rate and small library sizes, which decrease quantification accuracy and reduce power for statistical testing. Here, we report the analysis of two nanopore RNA-seq datasets with the goal of obtaining gene- and isoform-level differential expression information. A dataset of synthetic, spliced, spike-in RNAs ('sequins') and a mouse neural stem cell dataset from samples with a null mutation of the epigenetic regulator Smchd1 were analysed using a mix of long-read specific tools for preprocessing together with established short-read RNA-seq methods for downstream analysis. We used limma-voom to perform differential gene expression analysis, and the novel FLAMES pipeline to perform isoform identification and quantification, followed by DRIMSeq and limma-diffSplice (with stageR) to perform differential transcript usage analysis. We compared results from the sequins dataset to the ground truth, and results of the mouse dataset to a previous short-read study on equivalent samples. Overall, our work shows that transcriptomic analysis of long-read nanopore data using long-read specific preprocessing methods together with short-read differential expression methods and software that are already in wide use can yield meaningful results.
Project description:Evaluation of short-read-only, long-read-only, and hybrid assembly approaches on metagenomic samples, demonstrating how they affect gene and protein prediction, which is relevant for downstream functional analyses. For a human gut microbiome sample, we use complementary metatranscriptomic and metaproteomic data to evaluate the metagenomic-based protein predictions.
Project description:Mycobacterium tuberculosis is a contagious agent that causes tuberculosis. A specific type (called the K cluster) of M. tuberculosis with 10 copies of IS6110 in restriction fragment length polymorphism (RFLP) has been found in about 4% of M. tuberculosis isolates in Korea. Here, we report the complete genome sequence of M. tuberculosis Korean strain KIT87190 belonging to the K cluster.
Project description:Mycobacterium tuberculosis is known to cause pulmonary and extrapulmonary tuberculosis. In Morocco, the spread of multidrug-resistant (MDR) tuberculosis (TB) has become a major challenge. Here, we announce the draft genome sequences of two Mycobacterium tuberculosis strains, MTB1 and MTB2, isolated from patients with pulmonary tuberculosis in Morocco, to describe variants associated with drug resistance.