Project description:Single-cell RNA-sequencing (scRNA-seq) has quickly become an empowering technology to profile the transcriptomes of individual cells on a large scale. Many early analyses of differential expression have aimed at identifying differences between cell types (or clusters), and thus are focused on finding markers for cell populations either in a single sample or across multiple samples. More generally, such methods can compare expression levels in multiple sets of cells, thus leading to cross-condition analyses. However, given the emergence of replicated multi-condition scRNA-seq datasets, an area of increasing focus is making sample-level inferences, termed here differential state analysis. For example, one could investigate the condition-specific responses of specific immune cell subsets across cells measured from patients within each condition; however, it is not clear which statistical framework best handles this situation. In this work, we surveyed the methods available to perform cross-condition differential state analyses, including cell-level mixed models and methods based on aggregated "pseudobulk" data. We developed a flexible simulation platform that mimics both single- and multi-sample scRNA-seq data and provide robust tools for multi-condition analysis within the R package.
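As an illustration of the aggregated "pseudobulk" idea mentioned above, the following minimal Python sketch sums per-cell counts within each sample and cluster, so that downstream testing can treat samples rather than cells as the unit of replication. The input layout, the function name `pseudobulk`, and the gene and patient labels are assumptions made for illustration; this is not the interface of the R package described in the entry.

```python
from collections import defaultdict

def pseudobulk(cells):
    """Sum per-cell gene counts within each (sample, cluster) group.

    `cells` is an iterable of (sample_id, cluster_id, {gene: count}) tuples;
    this input layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, cluster_id, counts in cells:
        group = totals[(sample_id, cluster_id)]
        for gene, count in counts.items():
            group[gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in totals.items()}

# Toy data: two patients, one cluster, hypothetical gene names.
cells = [
    ("patient1", "Tcell", {"GENE_A": 2, "GENE_B": 0}),
    ("patient1", "Tcell", {"GENE_A": 1, "GENE_B": 3}),
    ("patient2", "Tcell", {"GENE_A": 0, "GENE_B": 5}),
]
bulk = pseudobulk(cells)
```

Each resulting (sample, cluster) profile can then be passed to a bulk-style differential expression method, one cluster at a time.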
Project description:Whole blood is a highly convenient and informative tissue from which to sample DNA and RNA in epigenomic and functional genomic studies, but it comprises multiple distinct cell types, and this complexity significantly impairs our ability to interpret downstream differential methylation and/or differential expression results. In this multiple sclerosis (MS)-focused study we utilised an application of current statistical deconvolution methods to interrogate whole blood DNA methylation data, thereby enabling the methylome of several immune cell types to be analysed independently. Methylome profiling on cell type-purified blood samples revealed optimal CpG sets for use as robust immune cell markers in the statistical deconvolution process. We show that it is possible to identify differentially methylated (DM) loci in a cell type-specific manner using statistical deconvolution. Finally, we demonstrate that deconvolution improved the biological relevance and interpretability of our DM results, significantly enhancing concordance of the identified DM loci with loci previously shown to be genetically or epigenetically associated with MS.
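One common building block of reference-based deconvolution can be sketched as follows: given methylation levels of marker CpGs in purified cell types, estimate the cell-type proportions of a mixed sample by non-negative least squares. The sketch below uses plain projected gradient descent; the function name `estimate_proportions`, the solver choice, and the toy reference values are assumptions for illustration and do not reproduce the specific deconvolution method used in the study.

```python
def estimate_proportions(reference, mixed, n_iter=2000):
    """Estimate non-negative cell-type proportions by projected gradient descent.

    reference: list of rows, one per CpG, each holding the methylation level of
               that CpG in every purified cell type.
    mixed:     methylation level of each CpG in the whole-blood sample.
    """
    n_types = len(reference[0])
    # Safe step size: 1 / (sum of squared reference entries), which bounds the
    # largest eigenvalue of R^T R from above.
    step = 1.0 / sum(v * v for row in reference for v in row)
    props = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual of the current fit at each CpG.
        resid = [sum(r * p for r, p in zip(row, props)) - m
                 for row, m in zip(reference, mixed)]
        # Gradient of 0.5 * ||R p - m||^2 with respect to p.
        grad = [sum(row[k] * e for row, e in zip(reference, resid))
                for k in range(n_types)]
        # Gradient step followed by projection onto p >= 0.
        props = [max(0.0, p - step * g) for p, g in zip(props, grad)]
    total = sum(props)
    return [p / total for p in props]  # rescale so proportions sum to 1

# Hypothetical marker CpGs (rows) for three cell types (columns).
reference = [
    [0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.2], [0.2, 0.8, 0.2], [0.2, 0.2, 0.8],
]
true_props = [0.5, 0.3, 0.2]
mixed = [sum(r * p for r, p in zip(row, true_props)) for row in reference]
estimated = estimate_proportions(reference, mixed)
```

On this noise-free toy mixture the estimate recovers the proportions used to build it; real data would contain noise, and cell type-specific DM testing requires further modelling beyond proportion estimation.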
Project description:Motivation: Detection of changes in DNA-protein interactions from ChIP-seq data is a crucial step in unraveling the regulatory networks behind biological processes. The simplest variation of this problem is the differential peak calling problem. Here one has to find genomic regions with ChIP-seq signal changes between two cellular conditions in the interaction of a protein with DNA. The great majority of peak calling methods can only analyse one ChIP-seq signal at a time and are unable to perform differential peak calling. Recently, a few approaches based on the combination of these peak callers with statistical tests for detecting differential digital expression have been proposed. However, these methods fail to detect detailed changes of protein-DNA interactions. Results: We propose ODIN, an HMM-based approach to detect and analyse differential peaks in pairs of ChIP-seq data. ODIN performs genomic signal processing, peak calling and p-value calculation in an integrated framework. We also propose an evaluation methodology to compare ODIN with competing methods. The evaluation method is based on the association of differential peaks with expression changes in the same cellular conditions. Our empirical study based on several ChIP-seq experiments from transcription factors, histone modifications and simulated data shows that ODIN outperforms considered competing methods in most scenarios. H3K4me1 and PU.1 occupancy in MPP, CDP, cDC and pDC
Project description:Purpose: mRNA translation into protein is highly regulated, but the role of mRNA isoforms, noncoding RNAs (ncRNAs), and genetic variants has yet to be systematically studied. Using high-throughput sequencing (RNA-seq), we have measured cellular levels of mRNAs and ncRNAs, and their isoforms, in lymphoblast cell lines (LCL) and in polysomal fractions, the latter shown to yield strong correlations of mRNAs with expressed protein levels. Analysis of allelic RNA ratios at heterozygous SNPs served to reveal genetic factors in ribosomal loading. Methods: RNA-seq was performed on cytosolic extracts and polysomal fractions (3 ribosomes or more) from three lymphoblastoid cell lines. As each RNA fraction was amplified (NuGen kit), and relative contributions from various RNA classes differed between cytosol and polysomes, the fraction of any given RNA species loaded onto polysomes was difficult to quantitate. Therefore, we focused on relative recovery of the various RNA classes and rank order of single RNAs compared to total RNA. Results: RNA-seq of coding and non-coding RNAs (including microRNAs) in three LCLs revealed significant differences in polysomal loading of individual RNAs and isoforms, and between RNA classes. Moreover, correlated distribution between protein-coding and non-coding RNAs suggests possible interactions between them. Allele-selective RNA recruitment revealed strong genetic influence on polysomal loading for multiple RNAs. Allelic effects can be attributed to generation of different RNA isoforms before polysomal loading or to differential loading onto polysomes, the latter defining a direct genetic effect on translation. Several variants and genes identified by this approach are also associated with RNA expression and clinical phenotypes in various databases. 
Conclusions: These results provide a novel approach using complete transcriptome RNA-seq to study polysomal RNA recruitment and regulatory variants affecting protein translation. Cells from 3 samples were grown to a density of 5x10^5 cells/mL in T75 tissue culture flasks and harvested; total RNA and polysome-bound RNA were sequenced by Ion Proton.
Project description:In the rapidly moving proteomics field, a diverse patchwork of algorithms for data normalization and differential expression analysis is used by the community. First, we generated an all-inclusive mass spectrometry downstream analysis pipeline (MS-DAP) that integrates many algorithms for normalization and statistical analyses and produces standardized quality reporting with extensive data visualizations. Second, systematic evaluation of normalization and statistical algorithms on various benchmarking datasets, including additional data generated in this study, suggests best practices for data analysis. Commonly used approaches for differential testing based on moderated t-statistics are consistently outperformed by more recent statistical models, all integrated in MS-DAP, and we encourage their adoption. Third, we introduced a novel normalization algorithm that rescues deficiencies observed in commonly used normalization methods. Finally, we used the MS-DAP platform to re-analyze a recently published large-scale proteomics dataset of CSF from AD patients. This revealed increased sensitivity, resulting in additional significant target proteins, which improved overlap with results reported in related studies and includes a large set of new potential AD biomarkers in addition to those previously reported.
Project description:Motivation: Detection of changes in DNA-protein interactions from ChIP-seq data is a crucial step in unraveling the regulatory networks behind biological processes. The simplest variation of this problem is the differential peak calling problem. Here one has to find genomic regions with ChIP-seq signal changes between two cellular conditions in the interaction of a protein with DNA. The great majority of peak calling methods can only analyse one ChIP-seq signal at a time and are unable to perform differential peak calling. Recently, a few approaches based on the combination of these peak callers with statistical tests for detecting differential digital expression have been proposed. However, these methods fail to detect detailed changes of protein-DNA interactions. Results: We propose ODIN, an HMM-based approach to detect and analyse differential peaks in pairs of ChIP-seq data. ODIN performs genomic signal processing, peak calling and p-value calculation in an integrated framework. We also propose an evaluation methodology to compare ODIN with competing methods. The evaluation method is based on the association of differential peaks with expression changes in the same cellular conditions. Our empirical study based on several ChIP-seq experiments from transcription factors, histone modifications and simulated data shows that ODIN outperforms considered competing methods in most scenarios.
Project description:We performed GeneChip analysis to identify multiple extracellular determinants, such as cytokines, cell membrane-bound molecules, and matrix, responsible for cardiomyogenic differentiation, and evaluated the statistical significance of differential gene expression using NIA Array Analysis (http://lgsun.grc.nia.nih.gov/ANOVA/) (Bioinformatics 21: 2548), a web-based tool for microarray data analysis. Keywords: Grem1-induced cardiogenesis via Wnt
Project description:Blood is a body fluid that contains multiple types of immune cells. Gene expression profiles of blood can reflect the physiopathological status of the immune system. This study aimed to explore the dynamic expression of whole blood microRNAs across different developmental stages, and how such differential expression relates to immune system development.
Project description:It is well documented that patients affected by rheumatoid arthritis (RA) have distinct susceptibility to the different biologic Disease-Modifying AntiRheumatic Drugs (bDMARDs) available on the market, probably because of the many facets of the disease. Monocytes are deeply involved in the pathogenesis of RA and we therefore evaluated and compared the transcriptomic profile of monocytes isolated from patients on treatment with methotrexate alone or in combination with tocilizumab, anti-TNFalpha or abatacept, and from healthy donors. Differential expression analysis of whole-genome transcriptomics yielded a list of regulated genes suitable for functional annotation enrichment analysis. Specifically, abatacept, tocilizumab and anti-TNFalpha cohorts were separately compared with methotrexate using a rank-product-based statistical approach, leading to the identification of 78, 6, and 436 differentially expressed genes, respectively.
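The core statistic behind a rank-product approach can be sketched as follows: within each replicate comparison genes are ranked by fold change, and each gene's rank product is the geometric mean of its ranks, so that consistently top-ranked genes obtain values close to 1. The function name `rank_products`, the input layout, and the toy values are assumptions for illustration; significance assessment, typically done by permutation, is omitted, and this is not the exact procedure used in the study.

```python
import math

def rank_products(fold_changes):
    """Return the rank product per gene.

    fold_changes: {gene: [log fold change in replicate 1, replicate 2, ...]}.
    Within each replicate, rank 1 goes to the most up-regulated gene; the rank
    product is the geometric mean of a gene's ranks across replicates.
    Ties are not handled specially in this sketch.
    """
    genes = list(fold_changes)
    n_reps = len(next(iter(fold_changes.values())))
    log_rank_sum = {g: 0.0 for g in genes}
    for rep in range(n_reps):
        ordered = sorted(genes, key=lambda g: fold_changes[g][rep], reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            log_rank_sum[gene] += math.log(rank)
    # Geometric mean computed on the log scale for numerical stability.
    return {g: math.exp(s / n_reps) for g, s in log_rank_sum.items()}

# Hypothetical log fold changes for three genes across three replicates.
toy = {
    "GENE_A": [2.0, 1.8, 2.2],
    "GENE_B": [0.1, -0.2, 0.3],
    "GENE_C": [-1.5, 0.0, -1.0],
}
rp = rank_products(toy)
```

Genes with the smallest rank products are the strongest candidates for up-regulation; repeating the ranking in ascending order gives the corresponding statistic for down-regulation.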