Project description:To investigate the gene regulatory mechanisms driving T cell development, we generated single-cell transcriptomics and chromatin accessibility data from a human fetal thymus sample at 10 weeks of gestation.
Project description:Bufotoxin is an endogenous toxin composed of several physiologically active components that toads deploy as a defense against their natural enemies. Bufadienolides (BDS), which are isolated from bufotoxin, are important anticancer drugs, and other components such as bufotenine and alkaloids are also important drug resources. The distribution characteristics and biosynthesis of bufotoxins in the postauricular glands (PGs) of toads are not well understood. We examined the toad's PGs using the MALDI-MSI technique; a total of 1,872 components were found, and some pharmacological components could be visualized. These findings indicate that bufotoxins are primarily abundant in the plasma glands (pG) and epidermal tissues of the glands. Using single-cell sequencing, we created a single-cell atlas of 9,316 PG cells. These cells were categorized into nine clusters using marker genes, and two types of epithelial cells were verified by in situ hybridization. Because cholesterol was confirmed to be a precursor of BDS biosynthesis, we concentrated on cholesterol metabolism and, through transcriptomic studies of two gland types with distinct secretory functions, pG and mucous glands (MG), postulated the primary bile acid pathway as a downstream pathway of BDS biosynthesis. Optimal and silenced genes for the potential BDS synthesis pathways, toad toxin tryptamine and alkaloid biosynthesis, the terpene skeleton, and steroid hormones were identified by calculating the cellular coverage of genes. Our data provide a metabolic map of bufotoxins in the toad's PGs and establish the first single-cell atlas of toad PGs, providing a reference for the study of the biosynthesis of natural active ingredients in animals.
Project description:Age prediction based on single-cell RNA-sequencing (scRNA-Seq) data can provide information about patients' susceptibility to various diseases and conditions. In addition, such analysis can be used to identify aging-related genes and pathways. To enable age prediction based on scRNA-Seq data, we developed PolyEN, a new regression model that learns continuous representations of expression over time. PolyEN then uses these representations to integrate genes and predict an age. Applied to existing lung aging data and to new data that we profiled, PolyEN showed improved performance over existing methods for age prediction. Our results identified lung epithelial cells as the most significant predictors for non-smokers, while lung endothelial cells led to the best chronological age prediction results for smokers.
Project description:Despite the growing availability of sophisticated bioinformatic methods for the analysis of single-cell RNA-seq data, few tools exist that allow biologists without extensive bioinformatic expertise to directly visualize and interact with their own data and results. Here, we present Cerebro (cell report browser), a Shiny- and Electron-based standalone desktop application for macOS and Windows that allows investigation and inspection of pre-processed single-cell transcriptomics data without requiring bioinformatic experience from the user. Through an interactive and intuitive graphical interface, users can (i) explore similarities and heterogeneity between samples and cell clusters in two-dimensional or three-dimensional projections such as t-SNE or UMAP, (ii) display the expression level of single genes or gene sets of interest, (iii) browse tables of most expressed genes and marker genes for each sample and cluster, and (iv) display trajectories calculated with Monocle 2. We provide three examples prepared from publicly available datasets to show how Cerebro can be used and what its capabilities are. Through a focus on flexibility and direct access to data and results, we think Cerebro offers a collaborative framework for bioinformaticians and experimental biologists that facilitates effective interaction and shortens the gap between analysis and interpretation of the data. Availability and implementation: The Cerebro application, additional documentation, and example datasets are available at https://github.com/romanhaa/Cerebro. Similarly, the cerebroApp R package is available at https://github.com/romanhaa/cerebroApp. All components are released under the MIT License. Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:Sample multiplexing enables pooled analysis during single-cell RNA sequencing workflows, thereby increasing throughput and reducing batch effects. A challenge for all multiplexing techniques is to link sample-specific barcodes with cell-specific barcodes, then demultiplex sample identity post-sequencing. However, existing demultiplexing tools fail under many real-world conditions where barcode cross-contamination is an issue. We therefore developed deMULTIplex2, an algorithm inspired by a mechanistic model of barcode cross-contamination. deMULTIplex2 employs generalized linear models and expectation-maximization to probabilistically determine the sample identity of each cell. Benchmarking reveals superior performance across various experimental conditions, particularly on large or noisy datasets with unbalanced sample compositions.
Project description:Single-cell sample multiplexing technologies function by associating sample-specific barcode tags with cell-specific barcode tags, thereby increasing sample throughput, reducing batch effects, and decreasing reagent costs. Computational methods must then correctly associate cell-tags with sample-tags, but their performance deteriorates rapidly when working with datasets that are large, have imbalanced cell numbers across samples, or are noisy due to cross-contamination among sample tags - unavoidable features of many real-world experiments. Here we introduce deMULTIplex2, a mechanism-guided classification algorithm for multiplexed scRNA-seq data that successfully recovers many more cells across a spectrum of challenging datasets compared to existing methods. deMULTIplex2 is built on a statistical model of tag read counts derived from the physical mechanism of tag cross-contamination. Using generalized linear models and expectation-maximization, deMULTIplex2 probabilistically infers the sample identity of each cell and classifies singlets with high accuracy. Using Randomized Quantile Residuals, we show the model fits both simulated and real datasets. Benchmarking analysis suggests that deMULTIplex2 outperforms existing algorithms, especially when handling large and noisy single-cell datasets or those with unbalanced sample compositions.
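To make the expectation-maximization idea in the two deMULTIplex2 descriptions above concrete, here is a minimal toy sketch. It is not the deMULTIplex2 model: the descriptions state that the tool uses generalized linear models derived from a mechanism of tag cross-contamination, whereas this sketch assumes a much simpler two-component Poisson mixture (background versus true signal) for the read counts of a single sample tag. The function name `em_two_poisson`, the initialisation, and the simulated counts are all hypothetical and chosen only for illustration.

```python
import numpy as np


def em_two_poisson(counts, n_iter=200, tol=1e-8):
    """Fit a two-component Poisson mixture to one tag's counts with EM.

    Toy assumption: each cell's count comes either from a low-rate
    "background" Poisson or a high-rate "signal" Poisson.
    Returns (lam_low, lam_high, weight_high, posterior_high).
    """
    x = np.asarray(counts, dtype=float)
    # Crude initialisation from the lower and upper quartiles.
    lam = np.array([max(np.percentile(x, 25), 0.1),
                    max(np.percentile(x, 75), 1.0)])
    if lam[0] >= lam[1]:
        lam[1] = lam[0] + 1.0
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log of weight * Poisson pmf, dropping the shared -log(x!) term.
        log_p = x[:, None] * np.log(lam) - lam + np.log(w)
        log_norm = np.logaddexp(log_p[:, 0], log_p[:, 1])
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: update mixing weights and rates from the responsibilities.
        nk = np.maximum(resp.sum(axis=0), 1e-12)
        w = np.clip(nk / x.size, 1e-12, 1.0)
        lam = np.maximum((resp * x[:, None]).sum(axis=0) / nk, 1e-8)
        ll = log_norm.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return lam[0], lam[1], w[1], resp[:, 1]


# Simulated example: 300 background cells and 100 cells carrying the tag.
rng = np.random.default_rng(0)
counts = np.concatenate([rng.poisson(2, 300), rng.poisson(50, 100)])
lam_low, lam_high, w_high, post = em_two_poisson(counts)
is_positive = post > 0.5  # probabilistic call thresholded at 0.5
print(lam_low, lam_high, w_high, int(is_positive.sum()))
```

In this toy setting each cell receives a posterior probability of carrying the tag rather than a hard threshold on raw counts, which is the general sense in which an EM-based classifier assigns sample identity probabilistically.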
Project description:Single-cell technologies are emerging fast due to their ability to unravel the heterogeneity of biological systems. While scRNA-seq is a powerful tool that measures whole-transcriptome expression of single cells, it does not capture their spatial localization. Novel spatial transcriptomics methods do retain cells' spatial information, but some methods can only measure tens to hundreds of transcripts. To resolve this discrepancy, we developed SpaGE, a method that integrates spatial and scRNA-seq datasets to predict whole-transcriptome expression in its spatial configuration. Using five dataset pairs, SpaGE outperformed previously published methods and showed scalability to large datasets. Moreover, SpaGE predicted new spatial gene patterns that were confirmed independently using in situ hybridization data from the Allen Mouse Brain Atlas.