Dataset Information

phRAIDER: Pattern-Hunter based Rapid Ab Initio Detection of Elementary Repeats.


ABSTRACT:

Motivation

Transposable elements (TEs) and repetitive DNA make up a sizable fraction of eukaryotic genomes, and their annotation is crucial to the study of the structure, organization, and evolution of any newly sequenced genome. Although RepeatMasker and nHMMER are useful for identifying these repeats, they require a pre-compiled repeat library, which is not always available. De novo identification tools such as Recon, RepeatScout, or RepeatGluer identify TEs purely from sequence content, but they are either limited by runtimes that prohibit whole-genome use or degrade in quality in the presence of substitutions that disrupt the sequence patterns.

Results

phRAIDER is a de novo TE identification tool that addresses the issue of excessive runtime without sacrificing sensitivity relative to competing tools. The underlying model is a new definition of elementary repeats that incorporates the PatternHunter spaced seed model, allowing for greater sensitivity in the presence of genomic substitutions. Compared with the premier tool in the literature, RepeatScout, phRAIDER shows an average 10× speedup on any single human chromosome and can process the whole human genome in just over three hours. Here we discuss the tool, the theoretical model underlying the tool, and the results demonstrating its effectiveness.
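To illustrate why a spaced seed tolerates substitutions better than a contiguous one, the following is a minimal Python sketch of spaced-seed indexing. It is not phRAIDER's implementation: the function names, the short seed `1101011`, and the toy sequence are all hypothetical and chosen only for illustration. A seed is a string of `1`s (positions that must match) and `0`s (positions that are ignored), so two copies of a repeat that differ only at an ignored position still share the same seed hit.

```python
from collections import defaultdict


def spaced_kmer(seq, start, seed):
    """Return the characters of seq under the '1' positions of seed,
    with the seed aligned at offset start."""
    return "".join(seq[start + i] for i, c in enumerate(seed) if c == "1")


def index_spaced_seeds(seq, seed):
    """Map each spaced k-mer to the sorted list of offsets where it occurs."""
    index = defaultdict(list)
    for start in range(len(seq) - len(seed) + 1):
        index[spaced_kmer(seq, start, seed)].append(start)
    return index


def repeated_seeds(seq, seed, min_count=2):
    """Keep only spaced k-mers occurring at least min_count times."""
    index = index_spaced_seeds(seq, seed)
    return {k: v for k, v in index.items() if len(v) >= min_count}


if __name__ == "__main__":
    # Two copies of a 7-base element that differ by one substitution
    # (G -> T at the third base), separated by a TTTT spacer.
    seq = "ACGTACG" + "TTTT" + "ACTTACG"

    # Illustrative spaced seed: the third position is a "don't care".
    print(repeated_seeds(seq, "1101011"))

    # Contiguous seed of the same length: the substitution breaks the match.
    print(repeated_seeds(seq, "1111111"))
```

In this toy case the spaced seed reports both copies as occurrences of the same seed hit, while the contiguous seed of equal length reports no repeated hit at all. Real seeds are longer and are designed so that such tolerance holds across many substitution patterns.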

Availability and implementation

phRAIDER is an open source tool available from https://github.com/karroje/phRAIDER

Contact: karroje@miamiOH.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Schaeffer CE 

PROVIDER: S-EPMC4908342 | biostudies-literature | 2016 Jun

REPOSITORIES: biostudies-literature

Publications

phRAIDER: Pattern-Hunter based Rapid Ab Initio Detection of Elementary Repeats.

Schaeffer CE, Figueroa ND, Liu X, Karro JE

Bioinformatics (Oxford, England), 2016-06-01, issue 12



Similar Datasets

| S-EPMC3402919 | biostudies-literature
| S-EPMC9248749 | biostudies-literature
| S-EPMC2664473 | biostudies-literature
| S-EPMC3530910 | biostudies-literature
| S-EPMC8197069 | biostudies-literature
| S-EPMC7674663 | biostudies-literature
| S-EPMC2238772 | biostudies-literature
| S-EPMC10998567 | biostudies-literature
| S-EPMC3373140 | biostudies-literature
| S-EPMC3268599 | biostudies-literature