Project description:One of the greatest challenges for the public health response to autism is providing access to evidence-based care. Sally J Rogers tells Andréia Azevedo Soares how parents can help their children mitigate the disabilities associated with autism.
Project description:Many spatially resolved transcriptomic technologies do not have single-cell resolution but measure the average gene expression for each spot from a mixture of cells of potentially heterogeneous cell types. Here, we introduce a deconvolution method, conditional autoregressive-based deconvolution (CARD), that combines cell-type-specific expression information from single-cell RNA sequencing (scRNA-seq) with correlation in cell-type composition across tissue locations. Modeling spatial correlation allows us to borrow the cell-type composition information across locations, improving accuracy of deconvolution even with a mismatched scRNA-seq reference. CARD can also impute cell-type compositions and gene expression levels at unmeasured tissue locations to enable the construction of a refined spatial tissue map with a resolution arbitrarily higher than that measured in the original study and can perform deconvolution without an scRNA-seq reference. Applications to four datasets, including a pancreatic cancer dataset, identified multiple cell types and molecular markers with distinct spatial localization that define the progression, heterogeneity and compartmentalization of pancreatic cancer.
Project description:Deconvolution of mouse transcriptomic data is challenged by the fact that mouse models carry various genetic and physiological perturbations, making it questionable to assume fixed cell types and cell type marker genes for different data set scenarios. We developed a Semi-Supervised Mouse data Deconvolution (SSMD) method to study the mouse tissue microenvironment. SSMD features (i) a novel nonparametric method to discover data set-specific cell type signature genes; (ii) a community detection approach for fixing cell types and their marker genes; and (iii) a constrained matrix decomposition method, robust to diverse experimental platforms, for solving cell type relative proportions. In summary, SSMD addresses several key challenges in the deconvolution of mouse tissue data, including (i) varied cell types and marker genes caused by highly divergent genotypic and phenotypic conditions of mouse experiments; (ii) diverse experimental platforms for mouse transcriptomics data; and (iii) small sample sizes and limited training data sources. It can estimate the proportions of 35 cell types in blood, inflammatory, central nervous or hematopoietic systems. In silico and experimental validation of SSMD demonstrated its high sensitivity and accuracy in identifying cell (sub)types and predicting cell proportions compared with state-of-the-art methods. A user-friendly R package and a web server of SSMD are released via https://github.com/xiaoyulu95/SSMD.
Project description:ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40-60-fold faster search and optimized model utilization enable prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com.
Project description:Computational deconvolution with single-cell RNA sequencing data as reference is pivotal to interpreting spatial transcriptomics data, but the current methods are limited to cell-type resolution. Here we present Redeconve, an algorithm to deconvolute spatial transcriptomics data at single-cell resolution, enabling interpretation of spatial transcriptomics data with thousands of nuanced cell states. We benchmark Redeconve with the state-of-the-art algorithms on diverse spatial transcriptomics platforms and datasets and demonstrate the superiority of Redeconve in terms of accuracy, resolution, robustness, and speed. Application to a human pancreatic cancer dataset reveals cancer-clone-specific T cell infiltration, and application to lymph node samples identifies differential cytotoxic T cells between IgA+ and IgG+ spots, providing novel insights into tumor immunology and the regulatory mechanisms underlying antibody class switch.
Project description:Many computational methods have been developed to infer cell type proportions from bulk transcriptomics data. However, an evaluation of the impact of data transformation, pre-processing, marker selection, cell type composition and choice of methodology on the deconvolution results is still lacking. Using five single-cell RNA-sequencing (scRNA-seq) datasets, we generate pseudo-bulk mixtures to evaluate the combined impact of these factors. Both bulk deconvolution methodologies and those that use scRNA-seq data as reference perform best when applied to data in linear scale and the choice of normalization has a dramatic impact on some, but not all methods. Overall, methods that use scRNA-seq data have comparable performance to the best performing bulk methods whereas semi-supervised approaches show higher error values. Moreover, failure to include cell types in the reference that are present in a mixture leads to substantially worse results, regardless of the previous choices. Altogether, we evaluate the combined impact of factors affecting the deconvolution task across different datasets and propose general guidelines to maximize its performance.
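The pseudo-bulk construction mentioned in the benchmarking abstract above can be illustrated with a minimal sketch. It assumes that a mixture is formed by summing the linear-scale counts of randomly drawn single cells, so the true cell-type proportions are known by construction. The function name `make_pseudo_bulk`, the toy cell types, and the toy counts are hypothetical and are not taken from the study.

```python
import random


def make_pseudo_bulk(cells, n_cells, rng):
    """Sum the counts of n_cells randomly drawn cells into one mixture.

    cells: list of (cell_type, counts) pairs, where counts is a list of
    per-gene counts in linear (untransformed) scale.
    Returns (mixture, proportions); proportions is the true cell-type
    composition of the drawn cells.
    """
    drawn = rng.sample(cells, n_cells)  # sample cells without replacement
    n_genes = len(drawn[0][1])
    mixture = [0] * n_genes
    type_counts = {}
    for cell_type, counts in drawn:
        type_counts[cell_type] = type_counts.get(cell_type, 0) + 1
        for gene_index, count in enumerate(counts):
            mixture[gene_index] += count
    proportions = {t: k / n_cells for t, k in type_counts.items()}
    return mixture, proportions


# Toy single-cell data (hypothetical): two cell types, three genes.
# The third gene is constant at 5 counts per cell.
rng = random.Random(0)
toy_cells = (
    [("alpha", [rng.randint(8, 12), rng.randint(0, 2), 5]) for _ in range(50)]
    + [("beta", [rng.randint(0, 2), rng.randint(8, 12), 5]) for _ in range(50)]
)
mixture, truth = make_pseudo_bulk(toy_cells, n_cells=20, rng=rng)
print(mixture, truth)
```

Because the ground-truth proportions are returned alongside the mixture, any deconvolution method's estimate can be scored directly against `truth`.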
Project description:Spatially resolved transcriptomics (SRT) has transformed tissue biology by linking gene expression profiles with spatial information. However, sequencing-based SRT methods aggregate signals from multiple cell types within capture locations ("spots"), masking cell-type-specific gene expression patterns. Traditional cell-type deconvolution methods estimate cell compositions within spots but fail to resolve cell-type-specific gene expression, limiting their ability to uncover critical biological processes such as cellular interactions and microenvironmental dynamics. Here, we present STged (spatial transcriptomic gene expression deconvolution), a novel computational framework that goes beyond traditional deconvolution by reconstructing cell-type-specific gene expression profiles from mixed spots. STged integrates graph-based spatial correlations and reference-derived gene signatures using a non-negative least-squares regression framework, achieving precise and biologically meaningful deconvolution. Comprehensive simulations show that STged consistently outperforms existing methods in accuracy and robustness. Applications to human pancreatic ductal adenocarcinoma and human squamous cell carcinoma datasets reveal its capacity to identify microenvironment-specific highly variable genes, reconstruct spatial cell-cell communication networks, and resolve tissue architecture at near-single-cell resolution. In mouse kidney tissues, STged uncovers dynamic spatial gene expression patterns and distinct gene programs, advancing our understanding of tissue heterogeneity and cellular dynamics.
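Non-negative least squares, the regression family named in the abstract above, can be sketched generically. This is only an illustration of plain NNLS, solved here with a multiplicative update rule (the same form used for one factor in non-negative matrix factorization). It omits the graph-based spatial terms, and how STged defines its design matrix and unknowns is not specified here. The names `nnls_multiplicative`, `signatures`, and `mixture` are hypothetical.

```python
def nnls_multiplicative(design, response, n_iter=2000, eps=1e-12):
    """Approximate argmin_x ||response - design @ x||^2 subject to x >= 0.

    design: list of rows with non-negative entries.
    response: list of non-negative values, one per row of design.
    Uses the multiplicative update x_k <- x_k * (A^T y)_k / (A^T A x)_k,
    which keeps x non-negative when design and response are non-negative.
    """
    n_rows, n_cols = len(design), len(design[0])
    # A^T y and A^T A do not change between iterations, so compute them once.
    at_y = [sum(design[i][k] * response[i] for i in range(n_rows))
            for k in range(n_cols)]
    at_a = [[sum(design[i][k] * design[i][j] for i in range(n_rows))
             for j in range(n_cols)]
            for k in range(n_cols)]
    x = [1.0] * n_cols  # strictly positive starting point
    for _ in range(n_iter):
        denom = [sum(at_a[k][j] * x[j] for j in range(n_cols))
                 for k in range(n_cols)]
        # eps guards against division by zero.
        x = [x[k] * at_y[k] / (denom[k] + eps) for k in range(n_cols)]
    return x


# Toy example (hypothetical): rows are genes, columns are two reference
# signatures; the mixture is an exact 0.7 / 0.3 combination of the columns.
signatures = [[10.0, 1.0], [1.0, 10.0], [5.0, 5.0]]
true_weights = [0.7, 0.3]
mixture = [sum(s * w for s, w in zip(row, true_weights)) for row in signatures]
estimated = nnls_multiplicative(signatures, mixture)
print([round(v, 4) for v in estimated])
```

The multiplicative rule was chosen for the sketch because it needs no step size and never leaves the non-negative region. An active-set NNLS solver would usually be preferred for larger problems.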
Project description:Background: Focal segmental glomerulosclerosis is a histopathological pattern of renal injury and comprises a heterogeneous group of clinical conditions with different pathophysiology, clinical course, prognosis, and treatment. Nevertheless, subtype differentiation in clinical practice often remains challenging, and we currently lack reliable diagnostic, prognostic, and therapeutic biomarkers. The advent of new transcriptomics techniques in kidney research holds great potential for the identification of gene expression biomarkers that can be applied in clinical practice. Summary: Transcriptomics techniques have been completely revolutionized in the last 2 decades, with the evolution from low-throughput reverse-transcription polymerase chain reaction and in situ hybridization techniques to microarrays and next-generation sequencing techniques, including RNA-sequencing and single-cell transcriptomics. The integration of human gene expression profiles with functional in vitro and in vivo experiments provides a deeper mechanistic insight into the candidate genes, which enables the development of novel targeted therapies. The correlation of gene expression profiles with clinical outcomes of large patient cohorts allows for the development of clinically applicable biomarkers that can aid in diagnosis and predict prognosis and therapy response. Finally, the integration of transcriptomics with other "omics" modalities creates a holistic view of disease pathophysiology. Key messages: New transcriptomics techniques allow high-throughput gene expression profiling of patients with focal segmental glomerulosclerosis (FSGS). The integration with clinical outcomes and fundamental mechanistic studies enables the discovery of new clinically useful biomarkers that will finally improve the clinical outcome of patients with FSGS.
Project description:Phylogenetics is a powerful tool for analyzing protein sequences, by inferring their evolutionary relationships to other proteins. However, phylogenetic analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.
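The "enrichment analysis" of gene lists mentioned above is often carried out as an over-representation test. The following minimal sketch shows one common form, a one-sided hypergeometric test. It is not necessarily the statistic that PANTHER's own tools apply, and the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """One-sided over-representation p-value P(X >= n_overlap).

    X is the number of annotated genes obtained when n_selected genes are
    drawn without replacement from n_background genes, of which
    n_annotated carry the annotation of interest.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20,000 background genes, 200 of them annotated with
# a function; a list of 100 genes contains 8 of the annotated genes.
p_value = hypergeom_enrichment_p(20000, 200, 100, 8)
print(p_value)
```

In practice one such test is run per annotation category, so the resulting p-values are usually corrected for multiple testing before interpretation.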