High dimensional association detection in large-scale genomic data
Ontology highlight
ABSTRACT: Joint analyses of genomic datasets obtained in multiple different conditions are essential for understanding the biological mechanism that drives tissue-specificity and cell differentiation, but they still remain computationally challenging. To address this we introduce CLIMB (Composite LIkelihood eMpirical Bayes), a statistical methodology that learns patterns of condition-specificity present in genomic data. CLIMB provides a generic framework facilitating a host of analyses, such as clustering genomic features sharing similar condition-specific patterns and identifying which of these features are involved in cell fate commitment. Our approach improves upon existing methods by boosting statistical power to identify meaningful signals while retaining interpretability and computational tractability. We illustrate CLIMB's value on two sets of hematopoietic data: one studying CTCF ChIP-seq measured in 17 different cell populations, and another examining RNA-seq measured across constituent cell populations in three committed lineages. These analyses demonstrate that CLIMB captures biologically relevant clusters in the data and improves upon commonly-used pairwise comparisons and unsupervised clusterings typical of genomic analyses.
ORGANISM(S): Mus musculus Homo sapiens
PROVIDER: GSE156074 | GEO | 2020/11/18
REPOSITORIES: GEO
ACCESS DATA