Dataset Information

ModuleDigger: an itemset mining framework for the detection of cis-regulatory modules.


ABSTRACT:

Background

The detection of cis-regulatory modules (CRMs) that mediate transcriptional responses in eukaryotes remains a key challenge in the postgenomic era. A CRM is characterized by a set of co-occurring transcription factor binding sites (TFBS). In silico methods have been developed to search for CRMs by determining the combination of TFBS that is statistically overrepresented in a given gene set. Most of these methods solve this combinatorial problem by relying on computationally intensive optimization methods. As a result, their use is limited to finding CRMs in small datasets (containing only a few genes) and to binding sites for a restricted number of transcription factors (TFs), from which the optimal module is selected.

Results

We present an itemset-mining-based strategy for computationally detecting CRMs in a set of genes. We tested our method by applying it to a large benchmark dataset derived from a ChIP-chip analysis and compared its performance with that of other well-known CRM detection tools.
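To illustrate the general idea of treating CRM detection as itemset mining, the sketch below treats each gene as a transaction and the TFs with predicted binding sites in its regulatory region as items, then enumerates TF combinations shared by at least a minimum number of genes. This is a minimal Apriori-style sketch, not the ModuleDigger implementation: the function name `frequent_tf_sets`, the parameters `min_support` and `max_size`, and the toy gene and TF labels are all hypothetical.

```python
from itertools import combinations


def frequent_tf_sets(gene_to_tfs, min_support, max_size=3):
    """Return {frozenset of TFs: set of supporting genes} for every TF set
    that occurs in at least `min_support` genes (level-wise, Apriori-style)."""
    # Level 1: collect the genes that contain each single TF.
    current = {}
    for gene, tfs in gene_to_tfs.items():
        for tf in tfs:
            current.setdefault(frozenset([tf]), set()).add(gene)
    current = {s: g for s, g in current.items() if len(g) >= min_support}

    result = dict(current)
    size = 1
    while current and size < max_size:
        size += 1
        candidates = {}
        # Join pairs of frequent sets from the previous level.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != size or union in candidates:
                continue
            # Genes supporting the union are those supporting both parts.
            genes = current[a] & current[b]
            if len(genes) >= min_support:
                candidates[union] = genes
        result.update(candidates)
        current = candidates
    return result


# Hypothetical toy input: gene -> TFs with a predicted binding site.
toy_genes = {
    "geneA": {"TF1", "TF2", "TF3"},
    "geneB": {"TF1", "TF2"},
    "geneC": {"TF1", "TF2", "TF3"},
    "geneD": {"TF4"},
}
modules = frequent_tf_sets(toy_genes, min_support=2)
```

Because any superset of an infrequent TF set is itself infrequent, each level only needs to combine sets that survived the previous level, which is what keeps this kind of search tractable for many genes and many candidate TFs.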

Conclusion

We show that, by exploiting the computational efficiency of an itemset mining approach and combining it with a well-designed statistical scoring scheme, we are able to prioritize the biologically valid CRMs in a large set of coregulated genes, using binding sites for a large number of potential TFs as input.
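The abstract does not describe the scoring scheme itself. As one common way to rank candidate TF combinations by overrepresentation, the sketch below computes a hypergeometric tail probability: the chance of seeing a TF set in at least `k` of `n` coregulated genes when `K` of `N` background genes contain it. This is an assumed stand-in for illustration, not the authors' scoring scheme, and the function name `hypergeom_tail` and its parameters are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    of which K contain the candidate TF set."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical numbers: a TF set found in 2 of 2 coregulated genes,
# and in 2 of 4 background genes overall.
p_value = hypergeom_tail(k=2, n=2, K=2, N=4)
```

A smaller value indicates a TF combination that is less likely to co-occur in the gene set by chance, so candidates can be sorted in ascending order of this score.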

SUBMITTER: Sun H 

PROVIDER: S-EPMC2648767 | biostudies-literature | 2009 Jan

REPOSITORIES: biostudies-literature

Publications

ModuleDigger: an itemset mining framework for the detection of cis-regulatory modules.

Sun H, De Bie T, Storms V, Fu Q, Dhollander T, Lemmens K, Verstuyf A, De Moor B, Marchal K

BMC Bioinformatics, 2009-01-30


Similar Datasets

| S-EPMC4182448 | biostudies-literature
| S-EPMC2697649 | biostudies-literature
| S-EPMC1796902 | biostudies-other
| S-EPMC4364064 | biostudies-literature
| S-EPMC3694643 | biostudies-literature
| S-EPMC139975 | biostudies-literature
| S-EPMC2896114 | biostudies-literature
| S-EPMC3541939 | biostudies-literature
| S-EPMC1665632 | biostudies-literature
| S-EPMC6104691 | biostudies-other