
Dataset Information


ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery.


ABSTRACT:

MOTIVATION: The availability of numerous ChIP-seq datasets for transcription factors (TFs) has provided an unprecedented opportunity to identify all TF binding sites in genomes. However, progress has been hindered by the lack of a highly efficient and accurate tool that finds not only the target motifs but also cooperative motifs in very large datasets.

RESULTS: We herein present an ultrafast and accurate motif-finding algorithm, ProSampler, based on a novel numeration method and Gibbs sampler. ProSampler runs orders of magnitude faster than the fastest existing tools while often identifying the motifs of both the target TFs and their cooperators more accurately. Thus, ProSampler can greatly facilitate efforts to identify the entire cis-regulatory code in genomes.

AVAILABILITY AND IMPLEMENTATION: Source code and binaries are freely available for download at https://github.com/zhengchangsulab/prosampler. ProSampler is implemented in C++ and supported on Linux, macOS and MS Windows platforms.

SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.

SUBMITTER: Li Y 

PROVIDER: S-EPMC6853706 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature


Publications

ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery.

Li Yang, Ni Pengyu, Zhang Shaoqiang, Li Guojun, Su Zhengchang

Bioinformatics (Oxford, England), 2019-11-01, issue 22



Similar Datasets

| S-EPMC3106185 | biostudies-literature
| S-EPMC10074035 | biostudies-literature
| S-EPMC3287167 | biostudies-literature
| S-EPMC3371830 | biostudies-literature
| S-EPMC5468353 | biostudies-literature
| S-EPMC3326300 | biostudies-literature
| S-EPMC3429929 | biostudies-literature
| S-EPMC6589551 | biostudies-literature
| S-EPMC4294205 | biostudies-literature
| S-EPMC4053760 | biostudies-literature