Project description:We benchmarked deconvolution algorithms on human brain gene expression data. The data deposited here include A) the mixtures on which the algorithms were benchmarked and B) signatures of pure brain cell-type expression.
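To illustrate how a mixture and a set of pure cell-type signatures relate in a deconvolution benchmark, here is a minimal sketch. It assumes a linear mixing model (mixture = signature × proportions) and uses plain non-negative least squares solved by projected gradient descent; this is a generic illustration, not one of the benchmarked algorithms. The function name `deconvolve_nnls` and the toy expression values are hypothetical.

```python
def deconvolve_nnls(signature, mixture, steps=2000):
    """Estimate cell-type proportions by non-negative least squares.

    signature: list of gene rows, each a list of per-cell-type expression.
    mixture:   list of bulk expression values, one per gene.
    Assumes a linear mixing model; solved by projected gradient descent.
    """
    n_types = len(signature[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so 1 / bound is a safe step size.
    bound = sum(v * v for row in signature for v in row)
    if bound == 0:
        raise ValueError("signature matrix is all zeros")
    step = 1.0 / bound

    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        residual = [
            sum(s * w for s, w in zip(row, weights)) - m
            for row, m in zip(signature, mixture)
        ]
        gradient = [
            sum(row[k] * r for row, r in zip(signature, residual))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]

    total = sum(weights)
    if total == 0:
        raise ValueError("all estimated weights are zero")
    return [w / total for w in weights]  # normalize to proportions


# Toy example: 4 genes, 2 hypothetical cell types, noise-free mixture.
signature = [[10.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 7.0]]
true_proportions = [0.7, 0.3]
mixture = [sum(s * p for s, p in zip(row, true_proportions)) for row in signature]
estimated = deconvolve_nnls(signature, mixture)
```

On real data the mixture contains noise and the signature is imperfect, so the recovered proportions will only approximate the truth; that gap is what a benchmark of this kind measures.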
Project description:A large number of computational methods have been recently developed for analyzing differential gene expression (DE) in RNA-seq data. We report on a comprehensive evaluation of the commonly used DE methods using the SEQC benchmark data set and data from the ENCODE project. We evaluated several key features, including normalization, accuracy of DE detection, and DE analysis when one condition has no detectable expression. We found significant differences among the methods. Furthermore, computational methods designed for DE detection from expression array data perform comparably to methods customized for RNA-seq. Most importantly, our results demonstrate that increasing the number of replicate samples improves detection power significantly more than increasing sequencing depth does. The Sequencing Quality Control Consortium generated two datasets from two reference RNA samples to evaluate transcriptome profiling by next-generation sequencing technology. Each sample contains one of the reference RNA sources and a set of synthetic RNAs from the External RNA Control Consortium (ERCC) at known concentrations. Group A contains 5 replicates of the Stratagene Universal Human Reference RNA (UHRR), which is composed of total RNA from 10 human cell lines, with 2% by volume of ERCC mix 1. Group B includes 5 replicate samples of the Ambion Human Brain Reference RNA (HBRR) with 2% by volume of ERCC mix 2. The ERCC spike-in control is a mixture of 92 synthetic polyadenylated oligonucleotides, 250-2000 nucleotides long, that are meant to resemble human transcripts.
Project description:A large number of computational methods have been recently developed for analyzing differential gene expression (DE) in RNA-seq data. We report on a comprehensive evaluation of the commonly used DE methods using the SEQC benchmark data set and data from the ENCODE project. We evaluated several key features, including normalization, accuracy of DE detection, and DE analysis when one condition has no detectable expression. We found significant differences among the methods. Furthermore, computational methods designed for DE detection from expression array data perform comparably to methods customized for RNA-seq. Most importantly, our results demonstrate that increasing the number of replicate samples improves detection power significantly more than increasing sequencing depth does.
Project description:RNA profiling technologies at single-cell resolutions, including single-cell and single-nuclei RNA sequencing (scRNA-Seq and snRNA-Seq, scnRNA-Seq for short), can help characterize the composition of tissues and reveal cells that influence key healthy and disease functions. However, the use of these technologies is challenging because of their relatively high costs and exacting sample collection requirements. Computational deconvolution methods that infer the composition of RNA-Seq-profiled samples using scnRNA-Seq-characterized cell types can expand the benefit of these technologies, but their effectiveness remains controversial. We produced the first systematic evaluation of deconvolution methods on datasets with either known compositions or based on concurrent RNA-Seq and scnRNA-Seq profiles. Our analyses revealed biases that are common to scnRNA-Seq 10X Genomics assays and illustrated the importance of accurate and properly controlled data preprocessing and method selection and optimization. Moreover, our results suggested that concurrent RNA-Seq and scnRNA-Seq profiles can help improve the accuracy of both scnRNA-Seq preprocessing and the deconvolution methods that employ them. Indeed, our proposed method, Single-cell RNA Quantity Informed Deconvolution (SQUID), combined RNA-Seq transformation and a dampened weighted least squares deconvolution approach to consistently outperform other methods in predicting the composition of cell mixtures and tissue samples. Moreover, our analysis suggested that only SQUID could identify outcomes-predictive cancer cell subtypes in pediatric acute myeloid leukemia and neuroblastoma datasets.
Project description:We performed a comprehensive evaluation of four different TiO2-based phosphopeptide enrichment methods using different non-phosphopeptide excluders. We then evaluated two phosphopeptide fractionation methods: ammonia-based and TEA-based high-pH reversed-phase fractionation.
Project description:Chemical proteomics encompasses novel drug target deconvolution methods in which compound modification is not required. Herein we use Thermal Proteome Profiling, Functional Identification of Target by Expression Proteomics and multiplexed redox proteomics for deconvolution of auranofin targets to aid elucidation of its mechanisms of action. Auranofin (Ridaura®) was approved for treatment of rheumatoid arthritis in 1985. Because several clinical trials are currently ongoing to repurpose auranofin for cancer therapy, comprehensive characterization of its targets and effects in cancer cells is important. Together, our chemical proteomics tools confirmed thioredoxin reductase 1 (TXNRD1) as a main auranofin target, with perturbation of oxidoreductase pathways as the top mechanism of drug action. Additional indirect targets included NFKB2 and CHORDC1. Our comprehensive data can be used as a proteomic signature resource for further analyses of the effects of auranofin. We also assessed the orthogonality and complementarity of the different chemical proteomics methods. Because these methods furnish mechanistic information, the combined approach can facilitate drug discovery efforts in general.
Project description:Differences in RNA content between cell types bias gene expression deconvolution methods. If ERCC spike-ins are added to the samples, the proportions predicted by deconvolution methods can be corrected.
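The description does not state how the correction is carried out. One plausible form, sketched below under stated assumptions, treats the deconvolution output as RNA fractions and divides each by an estimate of RNA content per cell for that cell type (for example, one derived from spike-in-normalized counts) before renormalizing. The function name `correct_for_rna_content`, the cell-type labels, and all numeric values are hypothetical.

```python
def correct_for_rna_content(rna_fractions, rna_per_cell):
    """Convert RNA-fraction estimates into cell-fraction estimates.

    rna_fractions: dict of cell type -> proportion predicted by deconvolution.
    rna_per_cell:  dict of cell type -> relative RNA content per cell
                   (assumed to be available, e.g. from spike-in data).
    """
    if rna_fractions.keys() != rna_per_cell.keys():
        raise ValueError("cell types must match in both inputs")
    if any(v <= 0 for v in rna_per_cell.values()):
        raise ValueError("RNA content per cell must be positive")

    # A cell type with more RNA per cell is over-represented in the
    # RNA fraction, so divide by its RNA content to undo that bias.
    scaled = {ct: rna_fractions[ct] / rna_per_cell[ct] for ct in rna_fractions}
    total = sum(scaled.values())
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    return {ct: v / total for ct, v in scaled.items()}


# Hypothetical example: neurons carry three times the RNA of astrocytes,
# so a 75/25 RNA split corresponds to an equal number of cells.
predicted = {"neuron": 0.75, "astrocyte": 0.25}
rna_content = {"neuron": 3.0, "astrocyte": 1.0}
corrected = correct_for_rna_content(predicted, rna_content)
```

Only the ratios between the RNA content values matter here, since the result is renormalized to sum to one.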