Dataset Information

Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling.


ABSTRACT: Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatinized DNA. Tn5 has a complex sequence bias that is not effectively scaled with traditional bias-correction methods. We model this complex bias using a rule ensemble machine learning approach that integrates information from many input k-mers proximal to the ATAC sequence reads. We effectively characterize and correct single-nucleotide sequence biases and regional sequence biases of the Tn5 enzyme. Correction of enzymatic sequence bias is an important step in interpreting chromatin accessibility assays that aim to infer transcription factor binding and regulatory activity of elements in the genome.

SUBMITTER: Wolpe JB 

PROVIDER: S-EPMC10236359 | biostudies-literature | 2023 Jun

REPOSITORIES: biostudies-literature

Publications

Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling.

Wolpe JB, Martins AL, Guertin MJ

NAR Genomics and Bioinformatics, 2023-06-02, issue 2



Similar Datasets

| S-EPMC6517279 | biostudies-literature
| S-EPMC6385462 | biostudies-literature
| S-EPMC8245295 | biostudies-literature
2016-12-21 | GSE92674 | GEO
| S-EPMC7192442 | biostudies-literature
| S-EPMC3042188 | biostudies-literature
| S-EPMC5725464 | biostudies-literature
| S-EPMC10776385 | biostudies-literature
| S-EPMC7862270 | biostudies-literature
| S-EPMC9001676 | biostudies-literature