ABSTRACT: Motivation
Research on epigenetic modifications and other chromatin features at genomic regulatory elements elucidates essential biological mechanisms including the regulation of gene expression. Despite the growing number of epigenetic datasets, new tools are still needed to discover novel distinctive patterns of heterogeneous epigenetic signals at regulatory elements.
Results
We introduce ChromDMM, a product Dirichlet-multinomial mixture model for clustering genomic regions that are characterized by multiple chromatin features. ChromDMM extends the mixture model framework by profile shifting and flipping that can probabilistically account for inaccuracies in the position and strand-orientation of the genomic regions. Owing to hyper-parameter optimization, ChromDMM can also regularize the smoothness of the epigenetic profiles across the consecutive genomic regions. With simulated data, we demonstrate that ChromDMM clusters, shifts and strand-orients the profiles more accurately than previous methods. With ENCODE data, we show that the clustering of enhancer regions in the human genome reveals distinct patterns in several chromatin features. We further validate the enhancer clusters by their enrichment for transcriptional regulatory factor binding sites.
Availability and implementation
ChromDMM is implemented as an R package and is available at https://github.com/MariaOsmala/ChromDMM.
Supplementary information
Supplementary data are available at Bioinformatics online.
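The shifting and flipping idea in the Results can be illustrated with a small sketch. This is not the ChromDMM implementation (which is an R package) and it does not use its API. It is a minimal Python illustration, assuming a single cluster with known Dirichlet parameters, a uniform prior over shift and flip states, and a count window that is at least as wide as the profile. The function names `dm_log_likelihood` and `shift_flip_posterior` are hypothetical. As a further simplification, each candidate window is scored only on the counts it contains, so windows with different totals are compared directly.

```python
import math


def dm_log_likelihood(counts, alpha):
    """Log Dirichlet-multinomial probability of one count vector.

    counts: non-negative integer counts, one per bin.
    alpha:  positive Dirichlet parameters, same length as counts.
    """
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: log n! - sum_i log x_i!
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizers.
    ll += math.lgamma(a) - math.lgamma(n + a)
    ll += sum(math.lgamma(x + al) - math.lgamma(al)
              for x, al in zip(counts, alpha))
    return ll


def shift_flip_posterior(features, alphas):
    """Posterior over (shift, flip) states for one region and one cluster.

    features: list of wide count vectors, one per chromatin feature
              (all of the same width W).
    alphas:   list of Dirichlet parameter vectors, one per feature
              (all of the same length L <= W).
    Assumes a uniform prior over states (an assumption of this sketch).
    The "product" structure shows up as a sum of log-likelihoods
    across features.
    """
    width = len(features[0])
    length = len(alphas[0])
    scores = {}
    for shift in range(width - length + 1):
        for flip in (False, True):
            ll = 0.0
            for x, al in zip(features, alphas):
                window = x[shift:shift + length]
                if flip:
                    window = window[::-1]  # reverse the strand orientation
                ll += dm_log_likelihood(window, al)
            scores[(shift, flip)] = ll
    # Normalize with the log-sum-exp trick for numerical stability.
    m = max(scores.values())
    z = sum(math.exp(v - m) for v in scores.values())
    return {state: math.exp(v - m) / z for state, v in scores.items()}


# One feature, window width equal to profile length, so only the flip varies.
posterior = shift_flip_posterior([[1, 5, 9]], [[1.0, 4.0, 8.0]])
print(posterior)
```

In this toy call the profile parameters increase from left to right, as do the counts, so the unflipped state receives more posterior mass than the flipped one. In a full mixture model these state posteriors would be combined with cluster responsibilities inside an EM-style procedure, which is omitted here.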
SUBMITTER: Osmala M
PROVIDER: S-EPMC9364382 | biostudies-literature | 2022 Aug
REPOSITORIES: biostudies-literature
Maria Osmala, Gökçen Eraslan, Harri Lähdesmäki
Bioinformatics (Oxford, England), 2022-08-01, issue 16