Dataset Information

Detecting Alu insertions from high-throughput sequencing data.


ABSTRACT: High-throughput sequencing technologies have allowed for the cataloguing of variation in personal human genomes. In this manuscript, we present alu-detect, a tool that combines read-pair and split-read information to detect novel Alus and their precise breakpoints directly from either whole-genome or whole-exome sequencing data while also identifying insertions directly in the vicinity of existing Alus. To set the parameters of our method, we use simulation of a faux reference, which allows us to compute the precision and recall of various parameter settings using real sequencing data. Applying our method to 100 bp paired Illumina data from seven individuals, including two trios, we detected on average 1519 novel Alus per sample. Based on the faux-reference simulation, we estimate that our method has 97% precision and 85% recall. We identify 808 novel Alus not previously described in other studies. We also demonstrate the use of alu-detect to study the local sequence and global location preferences for novel Alu insertions.
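The precision and recall figures above come from comparing the tool's calls against a known truth set. The sketch below illustrates that kind of comparison only; it is not the authors' code. It assumes the faux reference is built by deleting known Alus from the reference, so that the deleted sites act as the truth set. The function name `precision_recall`, the `(chromosome, position)` site representation, and the 50 bp matching tolerance are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A site is (chromosome, breakpoint position). This representation is an
# assumption for illustration, not the alu-detect output format.
Site = Tuple[str, int]


def precision_recall(calls: List[Site], truth: List[Site],
                     tolerance: int = 50) -> Tuple[float, float]:
    """Match each call to at most one unmatched truth site on the same
    chromosome within `tolerance` bp, then return (precision, recall)."""
    matched_truth = set()  # indices of truth sites already claimed by a call
    true_positives = 0
    for chrom, pos in calls:
        for i, (t_chrom, t_pos) in enumerate(truth):
            if i in matched_truth:
                continue
            if t_chrom == chrom and abs(t_pos - pos) <= tolerance:
                matched_truth.add(i)
                true_positives += 1
                break
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data: four Alus hypothetically removed to build the faux reference,
# and three insertion calls made against that faux reference.
truth_sites = [("chr1", 1000), ("chr1", 5000), ("chr2", 300), ("chr3", 42)]
called_sites = [("chr1", 1010), ("chr2", 290), ("chr2", 9000)]

precision, recall = precision_recall(called_sites, truth_sites)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example two of the three calls fall within the tolerance of a truth site, so precision is 2/3 and recall is 2/4. Sweeping a parameter of the caller and recomputing both values for each setting is one plausible way to choose parameters from such a simulation.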

SUBMITTER: David M 

PROVIDER: S-EPMC3783187 | biostudies-literature | 2013 Sep

REPOSITORIES: biostudies-literature

Publications

Detecting Alu insertions from high-throughput sequencing data.

David Matei M, Mustafa Harun H, Brudno Michael M

Nucleic Acids Research, 2013-08-05, issue 17


Similar Datasets

| S-EPMC5663213 | biostudies-literature
| S-EPMC7598922 | biostudies-literature
| S-EPMC8480091 | biostudies-literature
| S-EPMC3105418 | biostudies-literature
| S-EPMC5555480 | biostudies-literature
2021-02-02 | GSE114896 | GEO
| S-EPMC3832420 | biostudies-literature
| S-EPMC3458526 | biostudies-other
| S-EPMC4517485 | biostudies-literature
| S-EPMC2917713 | biostudies-literature