Project description:We performed a meta-analysis of publicly available TET1, 5mC, 5hmC and genome-wide bisulfite profiling data, mostly from mouse embryonic stem cells (ESC). Genome-wide chromatin immunoprecipitation combined with deep sequencing (ChIP-seq) has revealed binding of the TET1 protein at CpG-island (CGI) promoters and at bivalent promoters. We show that TET1 also coincides with DNaseI hypersensitive sites (HS). The presence of TET1 at these three locations suggests that it may play a dual role: an active role at CpG islands and DNaseI hypersensitive sites, and a repressive role at bivalent loci. In line with the presence of TET1, significant enrichment of 5hmC but not 5mC is detected at bivalent promoters and DNaseI HS. Surprisingly, 5hmC is either undetectable or present only at very low levels at CGI promoters, notwithstanding the presence of TET1 at these loci. Our meta-analysis suggests that asymmetric methylation is present at CA- and CT-repeats in the genome of some human ESC. Examination of the distribution of 5-methylcytosine and 5-hydroxymethylcytosine in the genome of mouse embryonic stem cells.
Project description:We present a meta-dataset comprising a total of 178 samples, including both primary tumors and tumor-free pancreatic tissues, from four independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 663 samples, including both primary tumors and tumor-free ovarian tissues, from ten independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 347 samples, including both primary tumors and tumor-free renal tissues, from six independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 237 samples, including both primary tumors and tumor-free prostate tissues, from six independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 212 samples, including both primary tumors and tumor-free bladder tissues, from four independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 214 samples, including both primary tumors and tumor-free melanoma tissues, from four independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 1566 samples, including both primary tumors and tumor-free colorectal tissues, from 15 independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
Project description:We present a meta-dataset comprising a total of 737 samples, including both primary tumors and tumor-free gastric tissues, from seven independent GEO datasets. To minimise inter-platform variation, only datasets generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open-source R packages implemented in our previously developed bioinformatics pipeline, each dataset was preprocessed with RMA normalisation; the datasets were then merged and batch-effect-corrected using the ComBat method. With its increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for identifying genes differentially expressed in the diseased phenotype.
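The merge and batch-correction steps named in the meta-dataset descriptions above can be illustrated with a small sketch. This is not the authors' R pipeline, and it is not ComBat itself: it is a simplified per-probe location and scale adjustment without the empirical Bayes shrinkage or covariate handling that ComBat provides. The function names, the input layout, and the dataset identifiers in the usage note are hypothetical, and the inputs are assumed to be log2 expression values that have already been RMA-normalised.

```python
from statistics import mean, pstdev


def merge_datasets(datasets):
    """Merge per-dataset expression tables on the probes shared by all samples.

    datasets: {dataset id: {sample id: {probe id: log2 expression value}}}
    Returns (probes, sample_ids, batches, matrix), where matrix[i][j] is the
    value of probe i in sample j and batches[j] is that sample's dataset id.
    """
    shared = None
    for samples in datasets.values():
        for profile in samples.values():
            keys = set(profile)
            shared = keys if shared is None else shared & keys
    probes = sorted(shared or [])

    sample_ids, batches, columns = [], [], []
    for dataset_id, samples in datasets.items():
        for sample_id, profile in samples.items():
            sample_ids.append(sample_id)
            batches.append(dataset_id)
            columns.append([profile[p] for p in probes])

    matrix = [[col[i] for col in columns] for i in range(len(probes))]
    return probes, sample_ids, batches, matrix


def location_scale_adjust(matrix, batches):
    """Align each batch's per-probe mean and spread with the pooled values.

    Simplified stand-in for ComBat (assumption): no empirical Bayes
    shrinkage and no biological covariates are modelled.
    """
    adjusted = []
    for row in matrix:
        grand_mean = mean(row)
        grand_sd = pstdev(row)
        new_row = list(row)
        for batch in set(batches):
            idx = [j for j, b in enumerate(batches) if b == batch]
            values = [row[j] for j in idx]
            batch_mean = mean(values)
            batch_sd = pstdev(values)
            # A constant batch carries no spread to rescale.
            scale = grand_sd / batch_sd if batch_sd > 0 else 1.0
            for j in idx:
                new_row[j] = grand_mean + (row[j] - batch_mean) * scale
        adjusted.append(new_row)
    return adjusted
```

A call such as `merge_datasets({"GSE_A": {...}, "GSE_B": {...}})` followed by `location_scale_adjust(matrix, batches)` yields a matrix in which every batch has the same per-probe mean. Because this sketch ignores the tumor versus tumor-free labels, it can also remove real biological differences when the two groups are unevenly distributed across datasets.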