Genomics

Dataset Information

HyDrop v2: Scalable atlas construction for training sequence-to-function models


ABSTRACT: Deciphering the cis-regulatory logic underlying cell type identity is a fundamental challenge in biology. Single-cell chromatin accessibility (scATAC-seq) data have enabled the training of sequence-to-function deep learning models that decode enhancer logic and support the design of synthetic enhancers. Training such models requires large amounts of high-quality data across species, organs, development, aging, and disease. To facilitate cost-effective generation of large scATAC-seq atlases for model training, we developed HyDrop v2, a new version of the open-source microfluidic system HyDrop with increased sensitivity and scale. We generated HyDrop-v2 atlases of the mouse cortex and of Drosophila embryonic development and compared them to atlases generated on commercial platforms. HyDrop-v2 data integrate seamlessly with data from a commercially available chromatin accessibility method (10x Genomics). Differentially accessible regions and motif enrichment across cell types are equivalent between the HyDrop-v2 and 10x atlases. Sequence-to-function models trained on either atlas are likewise comparable in enhancer predictions, sequence explainability, and transcription factor footprinting. By making data generation more accessible, HyDrop v2 allows enhancer models trained on HyDrop-v2 and mixed atlases to contribute to unraveling cell-type-specific regulatory elements in health and disease.

ORGANISM(S): Mus musculus, Drosophila melanogaster

PROVIDER: GSE293575 | GEO | 2025/04/02

REPOSITORIES: GEO

Similar Datasets

2025-03-28 | GSE292617 | GEO
| PRJNA1245282 | ENA
2021-08-17 | E-MTAB-9650 | biostudies-arrayexpress
2007-10-12 | E-GEOD-854 | biostudies-arrayexpress
2014-07-17 | E-GEOD-42967 | biostudies-arrayexpress
2023-11-19 | GSE240003 | GEO
2020-04-22 | GSE142238 | GEO
2014-07-17 | E-GEOD-44395 | biostudies-arrayexpress
2011-12-15 | E-GEOD-28998 | biostudies-arrayexpress
2024-08-31 | GSE246859 | GEO