Dataset Information

Binary Interval Search: a scalable algorithm for counting interval intersections.


ABSTRACT: The comparison of diverse genomic datasets is fundamental to understanding genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, intervals that overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units, by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. Availability: https://github.com/arq5x/bits
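
The abstract defines an intersection as two intervals sharing a common genome interval, and describes counting such intersections with binary searches. As a rough illustration only, the following Python sketch counts intersections by subtracting the intervals that cannot overlap a query: those ending before the query starts and those starting after it ends. It assumes closed integer intervals on a single chromosome; the function name `count_intersections` and the sample data are hypothetical and are not taken from the BITS source code.

```python
from bisect import bisect_left, bisect_right


def count_intersections(queries, database):
    """Count, for each query, how many database intervals intersect it.

    Assumes closed intervals (start, end) with start <= end.
    Illustrative sketch only, not the authors' implementation.
    """
    # Sort starts and ends independently; each needs its own binary search.
    starts = sorted(start for start, _ in database)
    ends = sorted(end for _, end in database)
    total = len(database)

    counts = []
    for q_start, q_end in queries:
        # Database intervals that end strictly before the query begins.
        ends_before = bisect_left(ends, q_start)
        # Database intervals that start strictly after the query ends.
        starts_after = total - bisect_right(starts, q_end)
        # Every remaining interval must overlap the query.
        counts.append(total - ends_before - starts_after)
    return counts


if __name__ == "__main__":
    database = [(1, 5), (4, 8), (10, 12), (20, 25)]
    queries = [(5, 10), (13, 19), (0, 30)]
    print(count_intersections(queries, database))
```

Because each query needs only two binary searches over pre-sorted arrays and no shared mutable state, queries can be processed independently, which is consistent with the abstract's remark that the approach suits parallel architectures.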

SUBMITTER: Layer RM 

PROVIDER: S-EPMC3530906 | biostudies-other | 2013 Jan

REPOSITORIES: biostudies-other

Publications

Binary Interval Search: a scalable algorithm for counting interval intersections.

Ryan M. Layer, Kevin Skadron, Gabriel Robins, Ira M. Hall, Aaron R. Quinlan

Bioinformatics (Oxford, England), 2012-11-04, issue 1


Similar Datasets

| S-EPMC5963474 | biostudies-other
| S-EPMC3734359 | biostudies-other
| S-EPMC8648127 | biostudies-literature
| S-EPMC6901075 | biostudies-literature
| S-EPMC6030823 | biostudies-literature
| S-EPMC4721873 | biostudies-literature
| S-EPMC2947915 | biostudies-literature
| S-EPMC6511857 | biostudies-literature
| S-EPMC4986259 | biostudies-literature