Project description:Transcriptional profiling of pre-malignant and malignant colorectal cancer lesions provides a means of temporally monitoring the key molecular events underlying neoplastic progression. Unfortunately, the most widely used central dataset for colorectal cancer samples, The Cancer Genome Atlas (TCGA), does not contain adenoma samples, so in silico analyses and pre-clinical modelling rely more heavily on a handful of independent microarray experiments. Because of differences in sample acquisition, preparation, downstream analysis and other parameters, results are often incongruent, hindering consensus building. Here, we developed a microarray meta-dataset consisting of 231 normal, 132 adenoma, and 342 colon cancer tissue samples (705 samples total) sourced from 12 independent microarray studies, all using the Affymetrix HG U133 Plus 2.0 (GPL570) chip platform: GSE4183, GSE8671, GSE9348, GSE15960, GSE20916, GSE21510, GSE22598, GSE23194, GSE23878, GSE32323, GSE33113, and GSE37364. Individual datasets were pre-processed and normalized by frozen robust multiarray averaging (fRMA) before merging by matching probe sets. Batch effects were subsequently identified by Principal Component Analysis (PCA) and removed using ComBat. In addition, low-variance probes were filtered from the meta-dataset before downstream analysis. Finally, biological signatures corresponding to cancer and adenoma samples were validated both quantitatively and functionally. Quantitative validation was performed by correlation analysis of LogFC values against the TCGA-COAD or other external GEO microarray datasets, respectively. Functional validation was carried out through predictive analyses using Ingenuity Pathway Analysis (IPA) and Gene Set Enrichment Analysis (GSEA).
Overall, our meta-dataset provides a powerful tool for studying transcriptome-wide changes which occur during early dysplasia and malignant transformation of adenomas as well as colorectal cancer in general.
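The merging, filtering, and quantitative-validation steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the probe IDs, the toy values, the variance cutoff of 0.1, and all function names are assumptions made for illustration, and the data are assumed to be already normalized and on a log2 scale.

```python
from math import sqrt
from statistics import mean, pvariance


def merge_on_shared_probes(studies):
    """Concatenate samples across studies, keeping only probe sets found in all of them.

    `studies` is a list of dicts mapping probe-set ID -> list of log2 expression values.
    """
    shared = set(studies[0])
    for study in studies[1:]:
        shared &= set(study)
    return {probe: [v for study in studies for v in study[probe]]
            for probe in sorted(shared)}


def filter_low_variance(expression, min_variance):
    """Drop probe sets whose variance across all samples is below `min_variance`."""
    return {probe: values for probe, values in expression.items()
            if pvariance(values) >= min_variance}


def log_fc(case_values, control_values):
    """Log fold change for log2-scale data: difference of the group means."""
    return mean(case_values) - mean(control_values)


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Toy studies with hypothetical probe IDs (not real GPL570 identifiers).
study_1 = {"probe_A": [5.0, 5.1], "probe_B": [7.0, 9.0], "probe_C": [6.0, 6.2]}
study_2 = {"probe_A": [5.0, 5.1, 4.9], "probe_B": [6.5, 9.5, 8.0], "probe_D": [3.0, 3.1, 3.2]}

merged = merge_on_shared_probes([study_1, study_2])      # only probes in both studies
filtered = filter_low_variance(merged, min_variance=0.1)  # cutoff is an assumption
print(sorted(merged), list(filtered))

# Hypothetical per-gene LogFC values from the meta-dataset and an external cohort.
meta_logfc = [1.2, -0.8, 0.3]
external_logfc = [1.0, -1.1, 0.4]
print(pearson(meta_logfc, external_logfc))
```

A high correlation between the two LogFC vectors would suggest that the merged data preserve the direction and magnitude of expression changes seen in the external cohort; the threshold for "high" is left to the analyst.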
Project description:We present a meta-dataset comprising 1566 samples, including both primary tumors and tumor-free colorectal tissues, from 15 independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation, merged, and corrected for batch effects via the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
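To give a feel for what batch-effect correction does to merged data, the sketch below uses per-batch mean centering for a single probe. This is a deliberately simplified stand-in and not ComBat itself, which additionally adjusts batch-specific scale and uses empirical Bayes shrinkage across probes. The function name, batch labels, and values are hypothetical.

```python
from statistics import mean


def center_by_batch(values, batches):
    """Remove per-batch location shifts for one probe.

    Each sample has its batch mean subtracted and the grand mean added back,
    so every batch ends up with the same mean expression.
    """
    grand_mean = mean(values)
    batch_means = {
        batch: mean(v for v, b in zip(values, batches) if b == batch)
        for batch in set(batches)
    }
    return [v - batch_means[b] + grand_mean for v, b in zip(values, batches)]


# One probe measured in two hypothetical studies with a systematic offset.
values = [5.0, 5.2, 7.0, 7.2]
batches = ["study_A", "study_A", "study_B", "study_B"]
corrected = center_by_batch(values, batches)
print(corrected)
```

Plain centering of this kind would also remove real biological differences when a batch is confounded with phenotype (for example, a study containing only tumors), which is one reason a model-based method that can account for known covariates is generally preferred.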
Project description:Whole-transcriptome expression levels of healthy colonic, colorectal adenoma, and colorectal cancer biopsy samples were analyzed using HTA 2.0 microarrays.
Project description:YAP1 plays important roles in the development of colorectal cancer, as evidenced by its overexpression in colorectal cancer and by the finding that its expression promotes proliferation and survival of colorectal cancer cells. To understand the potential roles of YAP1 in colorectal cancer, we overexpressed a constitutively active YAP1 mutant in NCI-H716 colorectal cancer cells and identified and analyzed genes whose expression is induced by YAP1 activation. Pre-clinical study