Project description: Background and aim: Analysis of data obtained from genome-wide gene expression experiments is challenging because of the large number of variables, the management of the data, and the need for multivariate analysis. Here we present the R package pcaGoPromoter, which facilitates the interpretation of genome-wide expression data and addresses these problems. In a first step, principal component analysis (PCA) is applied to give an overview of differences between the observations and of possible groupings. The next step is interpretation of the principal components with respect to both biological function and involvement of predicted transcription factor binding sites. The robustness of the results is evaluated using cross-validation. Illustrative PCA score plots and plots of Gene Ontology terms are available. To illustrate the functionality of the R package, we designed a serum stimulation experiment in which the main biological outcome is well documented.

Results: Samples from the serum stimulation experiment were analyzed using the Affymetrix Human Genome U133 Plus 2.0 chip. The array data were analyzed with the tools of the pcaGoPromoter package, which resulted in a clear separation of the observations into the three experimental groups: controls, serum only, and serum with inhibitor. The functional annotation of the axes in the PCA score plot showed the expected serum-promoted biological processes, such as cell cycle progression, and the predicted involvement of the expected transcription factors, including E2F. In addition, unexpected results were uncovered, e.g. cholesterol synthesis in serum-depleted cells and NF-κB activation in inhibitor-treated cells.

Conclusion: The pcaGoPromoter R package provides a collection of tools for analyzing gene expression data. It works with any platform that uses gene symbols or Entrez IDs as probe identifiers. In addition, support for several popular Affymetrix GeneChip platforms is provided. The tools give an overview of the data via principal component analysis, functional interpretation by Gene Ontology terms (biological processes), and an indication of which transcription factors may be involved. Thus, pcaGoPromoter structures the high-dimensional data of gene expression experiments and can be applied to generate hypotheses for further exploration.
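
The first step described in this abstract, projecting the observations onto a principal component to look for groupings, can be sketched in a few lines. This is a minimal illustration in plain Python, not the package's R interface: the toy expression matrix, the function names, and the choice of power iteration are all assumptions made for the example and do not reflect how pcaGoPromoter is implemented.

```python
import math


def center_columns(rows):
    """Subtract each column (gene) mean so PCA describes variation around the mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 via power iteration on X^T X.

    rows: one list of expression values per sample (samples x genes).
    """
    x = center_columns(rows)
    n_genes = len(x[0])
    v = [1.0 / math.sqrt(n_genes)] * n_genes  # arbitrary unit start vector
    for _ in range(iterations):
        # Project samples onto the current direction, then map back to gene space.
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(n_genes)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variance in the data; nothing to iterate on
            break
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


# Hypothetical toy data: 3 control samples followed by 3 treated samples, 3 genes.
samples = [
    [1.0, 2.0, 5.0],
    [1.2, 2.1, 5.1],
    [0.9, 1.9, 4.9],
    [3.0, 4.0, 5.0],
    [3.1, 4.2, 5.1],
    [2.9, 3.9, 4.9],
]

loadings, scores = first_principal_component(samples)
for label, score in zip(["ctrl"] * 3 + ["treated"] * 3, scores):
    print(label, round(score, 3))
```

In this toy case the two groups fall on opposite sides of zero along the first component, which is the kind of separation a score plot makes visible. A real analysis would use a tested linear algebra routine and more than one component.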

Project description: A cDNA microarray was designed and used to monitor the transcriptomic profile of Dehalococcoides mccartyi strain 195 (in a mixed community) respiring various chlorinated organics, including chloroethenes and 2,3-dichlorophenol. The cultures were continuously fed in order to establish steady-state respiration rates and substrate levels. Organizing the array data into a clustered heat map revealed two major experimental partitions. This partitioning of the data set was explored further through principal component analysis. The first two principal components separated the experiments into those with slow-respiring (1.6 ± 0.6 μM Cl- per h) and fast-respiring (22.9 ± 9.6 μM Cl- per h) cultures. Additionally, the transcripts with the highest loadings in these principal components were identified, suggesting that those transcripts were responsible for the partitioning of the experiments. By analyzing the transcriptomes (n = 53) across experiments, relationships among transcripts were identified, and hypotheses about the relationships between electron transport chain members were proposed. One hypothesis, that the hydrogenases Hup and Hym and the formate dehydrogenase-like oxidoreductase (DET0186–DET0187) form a complex (as suggested by their tight clustering in the heat map analysis), was explored using a nondenaturing protein separation technique combined with proteomic sequencing. Although these proteins did not migrate as a single complex, DET0112 (an FdhB-like protein encoded in the Hup operon) was found to comigrate with DET0187 rather than with the catalytic Hup subunit DET0110. On closer inspection of the genome annotations of all Dehalococcoides strains, the DET0185-to-DET0187 operon was found to lack a key subunit, an FdhB-like protein. Therefore, on the basis of the transcriptomic, genomic, and proteomic evidence, the place of the missing subunit in the DET0185-to-DET0187 operon is likely filled by recruiting a subunit expressed from the Hup operon (DET0112).
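
Identifying the transcripts with the highest loadings, as this abstract describes, amounts to ranking features by the absolute value of their coefficients in a principal component. The sketch below assumes the loading vector has already been computed. The transcript names, the loading values, and the helper name are hypothetical placeholders, not data or code from the study.

```python
def top_loading_features(names, loadings, k=3):
    """Return the k (name, loading) pairs with the largest absolute loading."""
    if len(names) != len(loadings):
        raise ValueError("names and loadings must have the same length")
    ranked = sorted(zip(names, loadings), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:k]


# Hypothetical loadings of four transcripts on one principal component.
transcript_names = ["transcript_1", "transcript_2", "transcript_3", "transcript_4"]
pc_loadings = [0.10, -0.80, 0.55, 0.05]

top_two = top_loading_features(transcript_names, pc_loadings, k=2)
print(top_two)
```

Ranking by absolute value keeps strongly negative loadings, which contribute to the separation along the component as much as strongly positive ones.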
