Project description: Bioinformatics, a discipline at the crossroads of Biology and Computational Sciences, also referred to as Computational Biology, is now widespread in research programs. However, implementing a Bioinformatics project requires the ability to comprehend biological concepts and apply computational approaches, and few undergraduate programs offer such multi-disciplinary training. In addition, understanding the dynamic between Biology research projects and Bioinformatics analyses is challenging without real-life experience. Course-based undergraduate research experience (CURE) courses are innovative programs that allow more students to acquire research experience and provide a well-suited setting to introduce students to applied bioinformatics. As part of the Bachelor of Health Sciences of the Cumming School of Medicine at the University of Calgary (Canada), a CURE in applied bioinformatics was implemented in the Winter terms from 2023 to 2025. Students investigated the effect of structural variants (SVs, genetic variants larger than 50 bp) on gene expression in the model organism Caenorhabditis elegans (a hermaphroditic roundworm about 1 mm long). The students detected and characterized SVs by analyzing genome and transcriptome sequencing data of C. elegans strains called balancers, which are known to carry large genomic variations that balance regions of the genome by limiting recombination, allowing lethal mutations to be maintained. They used Galaxy, a public web-based supercomputing resource, as well as a local high-performance computing system and R, to report different effects of SVs on gene expression and splicing. Students’ research explained the molecular mechanism behind the uncoordinated phenotype caused by the reciprocal translocation eT1(III;V) and uncovered unexpected effects on the expression of an understudied gene. 
We evaluated the course’s impact on students’ learning journeys and showed that the CURE improved students’ understanding of the Bioinformatics field and fostered their interest in research. We provide guidelines here to facilitate CURE implementation and improve undergraduate students’ access to bioinformatics research experiences.

Project description: 16S rRNA gene sequences are commonly analyzed for taxonomic and phylogenetic studies because they contain variable regions that can help distinguish different genera. However, intra-genus distinction using variable region homology is often impossible due to the high overall sequence identities among closely related species, even though some residues may be conserved within respective species. Using a computational method that included the allelic diversity within individual genomes, we discovered that certain Escherichia and Shigella species can be distinguished by a multi-allelic 16S rRNA variable region single nucleotide polymorphism (SNP). To evaluate the performance of 16S rRNAs with altered variable regions, we developed an in vivo system that measures the acceptance and distribution of variant 16S rRNAs into a large pool of natural versions supporting normal translation and growth. We found that 16S rRNAs containing evolutionarily disparate variable regions were underpopulated both in ribosomes and in active translation pools, even for an SNP. Overall, this study revealed that variable region sequences can substantially influence the performance of 16S rRNAs and that this biological constraint can be leveraged to justify refining taxonomic assignments of variable region sequence data. IMPORTANCE: This study reevaluates the notion that 16S rRNA gene variable region sequences are uninformative for intra-genus classification and that single nucleotide variations within them have no consequence for the strains that bear them. We demonstrated that the performance of 16S rRNAs in Escherichia coli can be negatively impacted by sequence changes in variable regions, even for single nucleotide changes that are native to closely related Escherichia and Shigella species; thus, biological performance is likely constraining the evolution of variable regions in bacteria. 
Further, the native nucleotide variations we tested occur in all strains of their respective species and across their multiple 16S rRNA gene copies, suggesting that these species evolved beyond what would be discerned from a consensus sequence comparison. Therefore, this work also reveals that the multiple 16S rRNA gene alleles found in most bacteria can provide more informative phylogenetic and taxonomic detail than a single reference allele.
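
The idea of a variable-region SNP that holds across every 16S rRNA gene copy of a species can be illustrated with a minimal sketch. It assumes the alleles of two species are already aligned to the same length, and it is not the computational method used in the study; the function name `fixed_differences` and the toy sequences are hypothetical.

```python
def fixed_differences(alleles_a, alleles_b):
    """Return (column, base_in_a, base_in_b) for each 0-based alignment column
    where all alleles of species A share one base, all alleles of species B
    share one base, and those two bases differ."""
    length = len(alleles_a[0])
    if any(len(seq) != length for seq in alleles_a + alleles_b):
        raise ValueError("all sequences must be aligned to the same length")
    sites = []
    for col in range(length):
        bases_a = {seq[col] for seq in alleles_a}
        bases_b = {seq[col] for seq in alleles_b}
        # A site counts only if it is invariant within each species
        # but different between them.
        if len(bases_a) == 1 and len(bases_b) == 1 and bases_a != bases_b:
            sites.append((col, bases_a.pop(), bases_b.pop()))
    return sites


# Toy aligned fragments (made up for illustration, not real 16S rRNA sequence).
species_a = ["ACGTTGCA", "ACGTTGCA", "ACGTTGCA"]
species_b = ["ACGTCGCA", "ACGTCGCA", "ACGTCGCT"]

print(fixed_differences(species_a, species_b))  # [(4, 'T', 'C')]
```

Column 7 is not reported because only one of the species B copies differs there, which is the kind of site a single consensus sequence could misrepresent.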

Project description: Background: The phylogeny of the genus Methanobrevibacter was established almost 25 years ago on the basis of the similarities of the 16S rRNA oligonucleotide catalogs. Since then, many 16S rRNA gene sequences of newly isolated strains or clones representing the genus Methanobrevibacter have been deposited. We tried to reorganize the 16S rRNA gene sequences of this genus and revise the taxonomic affiliation of the isolates and clones representing the genus Methanobrevibacter. Results: The phylogenetic analysis of the genus, based on a 786 bp aligned region from fifty-four representative sequences of the 120 available sequences for the genus, revealed seven multi-member groups, namely Ruminantium, Smithii, Woesei, Curvatus, Arboriphilicus, Filiformis, and the Termite gut symbionts, along with three separate lineages represented by Mbr. wolinii, Mbr. acididurans, and termite gut flagellate symbiont LHD12. The cophenetic correlation coefficient, a test for the ultrametric properties of the 16S rRNA gene sequences used for the tree, was found to be 0.913, indicating a high degree of goodness of fit of the tree topology. A significant relationship was found between the 16S rRNA sequence similarity (S) and the extent of DNA hybridization (D) for the genus, with the correlation coefficient (r) for logD and logS, and for [ln(-lnD) and ln(-lnS)], being 0.73 and 0.796 respectively. Our analysis revealed that for this genus, when S = 0.984, D would be <70% at least 99% of the time, and with 70% D as the species "cutoff", any 16S rRNA gene sequence showing <98% sequence similarity can be considered as a separate species. In addition, we deduced group-specific signature positions that have remained conserved in the evolution of the genus. Conclusions: A very significant relationship between D and S was found to exist for the genus Methanobrevibacter, implying that it is possible to predict D from S with a known precision for the genus. 
We propose to include the termite gut flagellate symbiont LHD12, the methanogenic endosymbionts of the ciliate Nyctotherus ovalis, and rat feces isolate RT reported earlier, as separate species of the genus Methanobrevibacter.
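
The two correlations reported between D and S, one on log-transformed values and one on ln(-ln) transformed values, can be sketched as follows. This is a minimal illustration, not the authors' analysis: the helper names and the paired values are hypothetical, D and S are assumed to be expressed as fractions strictly between 0 and 1, and the results are not expected to reproduce the published coefficients.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_log(value):
    """ln(-ln(value)); defined only for 0 < value < 1."""
    return math.log(-math.log(value))


# Hypothetical paired values (fractions), invented purely for illustration.
similarity = [0.90, 0.95, 0.97, 0.985, 0.995]     # S: 16S rRNA sequence similarity
hybridization = [0.05, 0.15, 0.30, 0.55, 0.85]    # D: extent of DNA hybridization

r_log = pearson([math.log(s) for s in similarity],
                [math.log(d) for d in hybridization])
r_loglog = pearson([log_log(s) for s in similarity],
                   [log_log(d) for d in hybridization])

print(f"r(log S, log D) = {r_log:.3f}")
print(f"r(ln(-ln S), ln(-ln D)) = {r_loglog:.3f}")
```

Because ln(-ln x) is undefined at exactly 0 or 1, identical sequences (S = 1) would need to be excluded or handled separately in a sketch like this.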

Dataset Information

16S rRNA sequences
