Dataset Information

SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering.


ABSTRACT:

Background

The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops. Identifying such hierarchical structures is a critical step in understanding genome regulation. Existing tools for TAD calling are frequently sensitive to biases in Hi-C data, depend on tunable parameters, and are computationally inefficient.

Methods

To address these challenges, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors for TAD boundary identification.
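As a rough illustration of the eigenvector-gap idea, the sketch below processes a single window of a contact matrix. It is not the SpectralTAD implementation. The function name `eigenvector_gap_boundaries`, its parameters, the choice of the symmetric normalized Laplacian, and the toy matrix are all assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np

def eigenvector_gap_boundaries(contacts, n_vectors=2, n_boundaries=1):
    """Hypothetical helper: place boundaries at the largest eigenvector gaps.

    Returns (boundaries, gaps); a boundary index i means a boundary
    between bin i and bin i + 1. Assumes every bin has nonzero contacts.
    """
    A = np.asarray(contacts, dtype=float)
    degree = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^(-1/2) A D^(-1/2)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the eigenvectors
    # that belong to the n_vectors smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    V = vecs[:, :n_vectors]
    # Scale each row (one row per genomic bin) to unit length.
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    # Gap = Euclidean distance between the rows of consecutive bins.
    gaps = np.linalg.norm(np.diff(V, axis=0), axis=1)
    top = np.argsort(gaps)[::-1][:n_boundaries]
    return sorted(int(i) for i in top), gaps

# Toy window: two 5-bin domains with strong contacts inside each domain (10)
# and weak contacts between them (1).
toy = np.ones((10, 10))
toy[:5, :5] = 10.0
toy[5:, 5:] = 10.0

boundaries, gaps = eigenvector_gap_boundaries(toy)
print(boundaries)  # [4]
```

On this toy window the only large gap falls between bins 4 and 5, where the two simulated domains meet. A sliding-window version would repeat the same computation on overlapping sub-matrices along the diagonal and would need a rule for choosing the number of boundaries, which this sketch leaves as a fixed argument.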

Results

Our method, implemented in the R package SpectralTAD, detects hierarchical, biologically relevant TADs, selects its parameters automatically, and is robust to the sequencing depth, resolution, and sparsity of Hi-C data. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy were more enriched in classical boundary marks and more conserved across cell lines and tissues. In contrast, boundaries of TADs that cannot be split into sub-TADs showed less enrichment and conservation, suggesting their more dynamic role in genome regulation.

Conclusion

SpectralTAD is available on Bioconductor at http://bioconductor.org/packages/SpectralTAD/.

SUBMITTER: Cresswell KG 

PROVIDER: S-EPMC7372752 | biostudies-literature | 2020 Jul

REPOSITORIES: biostudies-literature

Publications

SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering.

Kellen G. Cresswell, John C. Stansfield, Mikhail G. Dozmorov

BMC Bioinformatics, 2020-07-20, issue 1


Similar Datasets

| S-EPMC7076128 | biostudies-literature
| S-EPMC6980758 | biostudies-literature
2019-12-04 | GSE139810 | GEO
2014-03-26 | E-GEOD-49111 | biostudies-arrayexpress
| S-EPMC7394310 | biostudies-literature
2014-03-26 | GSE49111 | GEO
| S-EPMC4150797 | biostudies-literature
| S-EPMC8037919 | biostudies-literature
| S-EPMC6648328 | biostudies-literature
| S-EPMC3534501 | biostudies-literature