Project description:Histone modifications are a key epigenetic mechanism to activate or repress the expression of genes. Data sets of matched microarray expression data and histone modification data measured by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel bioinformatic approach to detect genes that are differentially expressed between two conditions putatively caused by alterations in histone modification. We introduce a correlation measure for integrative analysis of ChIP-seq and gene expression data and demonstrate that a proper normalization of the ChIP-seq data is crucial. We suggest applying Bayesian mixture models of different distributions to further study the distribution of the correlation measure. The implicit classification of the mixture models is used to detect genes with differences between two conditions in both gene expression and histone modification. The method is applied to different data sets and its superiority to a naive separate analysis of both data types is demonstrated. This GEO series contains the expression data of the Cebpa example data set.
Project description:Histone modifications are a key epigenetic mechanism to activate or repress the expression of genes. Data sets of matched microarray expression data and histone modification data measured by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel bioinformatic approach to detect genes that are differentially expressed between two conditions putatively caused by alterations in histone modification. We introduce a correlation measure for integrative analysis of ChIP-seq and gene expression data and demonstrate that a proper normalization of the ChIP-seq data is crucial. We suggest applying Bayesian mixture models of different distributions to further study the distribution of the correlation measure. The implicit classification of the mixture models is used to detect genes with differences between two conditions in both gene expression and histone modification. The method is applied to different data sets and its superiority to a naive separate analysis of both data types is demonstrated. This GEO series contains the expression data of the Cebpa example data set. This data set was derived from sorted Cebpa fl/fl and Cebpa fl/fl;Mx1Cre murine hematopoietic LSK CD150- cells 18 post pIpC injections (conditional deletion of Cebpa). The specimens from three Cebpa fl/fl and three Cebpa fl/fl;Mx1Cre mice were hybridized separately on six Affymetrix Mouse Gene 1.0 ST arrays. Associated histone modification ChIP-seq data are provided by series GSE43007.
Project description:As systems biology approaches to virology have become more tractable, it has become possible to analyze highly studied viruses such as HIV in new, unbiased ways, including spatial proteomics. We have employed here a differential centrifugation protocol to fractionate an inducible model of HIV-expression in Jurkat T cells for proteomic analysis by mass spectrometry. Using these proteomics data, we evaluated the merits of several reported machine learning pipelines for classification of the spatial proteome and identification of protein translocations. From these analyses we found that classifier performance was organelle-dependent, with Bayesian t-augmented Gaussian mixture modeling outperforming support vector machine (SVM) learning for mitochondrial and ER proteins, but underperforming on cytosolic, nuclear, and plasma membrane proteins by QSep analysis. We also observed a generally higher performance for protein translocation identification using a Bayesian model, BANDLE, on SVM-classified data. Comparative BANDLE analysis of WT and ΔNef models also identified known Nef-dependent interactors such as TCR signaling and coatomer complex. Lastly, we found that SVM classification showed higher consistency and was less sensitive to HIV-dependent noise in our data. These findings illustrate important considerations for future studies of the spatial proteome following viral infection or expression where their generalizability can be further assessed.
Project description:Analysis of primary esophageal squamous cell carcinoma (ESCC) from 71 patients in Japan. Integrative analysis of gene expression profiles and genomic alterations obtained from array-CGH and NGS provided new insight into the pathogenesis of ESCC. Gene expression levels were obtained from 71 microdissected ESCC tumors. We used the commercially available Human Whole Genome Oligo DNA Microarray Kit (Agilent Technologies). Labeled cRNAs were fragmented and hybridized to an oligonucleotide microarray (Whole Human Genome 4×44K Agilent G4112F). Fluorescence intensities were determined with an Agilent DNA Microarray Scanner. The gene expression profiles (GE) obtained from microarray data were quantile normalized. Batch effects in the microarray experiments were adjusted using an empirical Bayesian approach.
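The normalization step named in the ESCC record can be illustrated with a minimal sketch, assuming it refers to standard quantile normalization across arrays. The function name `quantile_normalize` and the genes-by-samples matrix layout are assumptions made for illustration; the record does not state which software or tie-handling rule the authors used, and the empirical Bayesian batch adjustment is not covered here.

```python
import numpy as np


def quantile_normalize(x):
    """Quantile-normalize the columns of a (genes x samples) matrix.

    Hypothetical helper for illustration only. Every column is forced to
    share one reference distribution: the mean of the sorted columns.
    Ties are not averaged in this simplified version.
    """
    x = np.asarray(x, dtype=float)
    # Per-column ordering of the genes, from smallest to largest value.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: mean across samples at each rank.
    mean_quantiles = sorted_x.mean(axis=1)
    out = np.empty_like(x)
    # The gene holding rank r in a column receives the r-th reference value.
    np.put_along_axis(out, order, mean_quantiles[:, None], axis=0)
    return out
```

After this transformation each array has the same set of values and only the ranking of genes within an array differs, which is why the step is commonly applied before comparing expression across arrays.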