A Random-Forest Based Algorithm for Prediction of Enhancers From Histone Modifications
Ontology highlight
ABSTRACT: Transcriptional enhancers play critical roles in regulation of gene expression, but their identification has remained a challenge. Recently, it was shown that enhancers in the mammalian genome are associated with characteristic histone modification patterns, which have been increasingly exploited for enhancer identification. However, only a limited number of histone modifications have previously been investigated for this purpose, leaving the questions answered whether there exist an optimal set of histone modifications that could improve the enhancer prediction. Here, we address this issue by exploring a rich dataset produced by the human Epigenome Roadmap Project. Specifically, we examined genome-wide profiles of 24 histone modifications in human embryonic stem cells and fibroblasts, and developed a Random-Forest based algorithm to integrate histone modification profiles for identification of enhancers.As a training set, we used histone modification profiles at genome-wide binding sites of p300 in the two cell types identified using ChIP-seq. We show that this algorithm not only leads to more accurate and precise prediction of enhancers than previous methods, but also helps identify an optimal set of three chromatin marks for enhancer prediction. ChIP-Seq Analysis of p300 in hESC H1 and IMR90 cells. Sequencing was done on the Illumina Genome Analyzer II platform for the H1 data and Illumina HiSeq for IMR90.Data was mapped to hg18 using Bowtie.
ORGANISM(S): Homo sapiens
SUBMITTER: Nisha Rajagopal
PROVIDER: E-GEOD-37858 | biostudies-arrayexpress |
REPOSITORIES: biostudies-arrayexpress
ACCESS DATA