Dataset Information

BOAT: Basic Oligonucleotide Alignment Tool.

ABSTRACT:

Background

Next-generation DNA sequencing technologies generate tens of millions of sequencing reads in one run. These technologies are now widely used in biological research, for example in genome-wide identification of polymorphisms, transcription factor binding sites, methylation states, and transcript expression profiles. Mapping the sequencing reads to reference genomes efficiently and effectively is one of the most critical analysis tasks. Although several tools have been developed, their performance suffers when multiple substitutions and insertions/deletions (indels) occur together.

Results

We report a new algorithm, Basic Oligonucleotide Alignment Tool (BOAT), that can accurately and efficiently map sequencing reads back to the reference genome. BOAT can handle several substitutions and indels simultaneously, a useful feature for identifying SNPs and other genomic structural variations in functional genomic studies. For better handling of low-quality reads, BOAT supports a "3'-end Trimming Mode" that builds locally optimized alignments for sequencing reads, further improving sensitivity. BOAT calculates an E-value for each hit as a quality assessment and provides customizable post-mapping filters for further mapping quality control.
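The idea behind trimming the 3' end of a low-quality read can be illustrated with a small sketch. It is not BOAT's implementation, which the abstract does not describe: the function name `trim_3prime`, the ungapped comparison, and the scoring values (+1 for a match, -3 for a mismatch) are all assumptions chosen for illustration. The sketch keeps the read prefix whose cumulative score against the reference is highest, so an error-rich 3' tail is dropped while an isolated internal mismatch is tolerated.

```python
def trim_3prime(read, ref_window, match=1, mismatch=-3):
    """Return (kept_length, score) for the best-scoring read prefix.

    Hypothetical helper: compares the read to a reference window of the
    same orientation without gaps, accumulating a running score from the
    5' end. The prefix with the highest running score is kept; everything
    after it (the 3' tail) is trimmed. Scoring values are assumptions.
    """
    best_len, best_score, score = 0, 0, 0
    for i, (read_base, ref_base) in enumerate(zip(read, ref_window), start=1):
        score += match if read_base == ref_base else mismatch
        if score > best_score:
            best_len, best_score = i, score
    return best_len, best_score


if __name__ == "__main__":
    reference = "ACGTACGTACGGGG"
    noisy_read = "ACGTACGTACTTTT"  # last four bases disagree with the reference
    kept, score = trim_3prime(noisy_read, reference)
    print(noisy_read[:kept], score)
```

A real mapper would also have to handle indels and base qualities; this sketch only shows why a locally optimal prefix can rescue a read that fails end-to-end matching.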

Conclusion

Evaluations on both real and simulated datasets suggest that BOAT is capable of mapping large volumes of short reads to reference sequences with better sensitivity and lower memory requirements than other existing algorithms. The source code and pre-compiled binary packages of BOAT are publicly available for download at http://boat.cbi.pku.edu.cn under the GNU General Public License (GPL). BOAT can be a useful new tool for functional genomics studies.

SUBMITTER: Zhao SQ

PROVIDER: S-EPMC2788372 | biostudies-literature | 2009 Dec

REPOSITORIES: biostudies-literature

Publications

BOAT: Basic Oligonucleotide Alignment Tool.

Zhao Shu-Qi, Wang Jun, Zhang Li, Li Jiong-Tang, Gu Xiaocheng, Gao Ge, Wei Liping

BMC Genomics, 2009-12-03

PMID: 19958483

Similar Datasets

Project description: Background: BLAST is one of the most common and useful tools for genetic research. This paper describes a software application we have termed Windows .NET Distributed Basic Local Alignment Search Toolkit (W.ND-BLAST), which enhances the BLAST utility by improving usability, fault recovery, and scalability in a Windows desktop environment. Our goal was to develop an easy-to-use, fault-tolerant, high-throughput BLAST solution that incorporates a comprehensive BLAST result viewer with curation and annotation functionality. Results: W.ND-BLAST is a comprehensive Windows-based software toolkit that targets researchers, including those with minimal computer skills, and provides the ability to increase the performance of BLAST by distributing BLAST queries to any number of Windows-based machines across local area networks (LANs). W.ND-BLAST provides intuitive graphical user interfaces (GUIs) for BLAST database creation, BLAST execution, BLAST output evaluation, and BLAST result export. This software also provides several layers of fault tolerance and fault recovery to prevent loss of data if nodes or master machines fail. This paper lays out the functionality of W.ND-BLAST. W.ND-BLAST displays close to 100% performance efficiency when distributing tasks to 12 remote computers of the same performance class. A high-throughput BLAST job that took 662.68 minutes (11 hours) on one average machine was completed in 44.97 minutes when distributed to 17 nodes, which included lower performance class machines. Finally, there are comprehensive high-throughput BLAST Output Viewer (BOV) and Annotation Engine components, which provide comprehensive export of BLAST hits to text files, annotated FASTA files, tables, or association files. Conclusion: W.ND-BLAST provides an interactive tool that allows scientists to easily utilize their available computing resources for high-throughput and comprehensive sequence analyses.
The install package for W.ND-BLAST is freely downloadable from http://liru.ars.usda.gov/mainbioinformatics.html. With registration the software is free; installation, networking, and usage instructions are provided, as well as a support forum.
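The timing figures quoted for W.ND-BLAST can be turned into a speedup and a parallel efficiency with a short calculation. The sketch below uses only the numbers given in the description (662.68 minutes on one machine, 44.97 minutes on 17 nodes); the function name `speedup_and_efficiency` is an illustrative assumption, and efficiency is taken as speedup divided by node count, which treats all nodes as equal even though the description notes that some were lower-performance machines.

```python
def speedup_and_efficiency(serial_minutes, parallel_minutes, nodes):
    """Return (speedup, efficiency) for a distributed run.

    speedup    = serial time / parallel time
    efficiency = speedup / number of nodes (assumes equal-capacity nodes)
    """
    speedup = serial_minutes / parallel_minutes
    return speedup, speedup / nodes


if __name__ == "__main__":
    # Figures taken from the W.ND-BLAST description above.
    speedup, efficiency = speedup_and_efficiency(662.68, 44.97, 17)
    print(f"speedup = {speedup:.2f}x, efficiency = {efficiency:.1%}")
```

Under these assumptions the 17-node run is about 14.7 times faster than the single machine, which corresponds to an efficiency of roughly 87%.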

Project description: Seagrass meadows commonly reside in shallow sheltered embayments, typical of the locations that provide an attractive option for mooring boats. Given the potential for boat moorings to disturb the seabed through repeated physical impact, these moorings may present a significant threat to seagrass meadows. The seagrass Zostera marina (known as eelgrass) is extensive across the northern hemisphere, forming critical fisheries habitat and creating efficient long-term stores of carbon in sediments. Although boat moorings have been documented to impact seagrasses, studies to date have been conducted on the slow-growing Posidonia species rather than the fast-growing and rapidly reproducing Z. marina, which may have a higher capacity to resist and recover from repeated disturbance. In the present study we examine swinging chain boat moorings in seagrass meadows across a range of sites in the United Kingdom to determine whether such moorings have a negative impact on Z. marina at the local and meadow scale. We provide conclusive evidence from multiple sites that Z. marina is damaged by swinging chain moorings, leading to a loss of at least 6 ha of United Kingdom seagrass. Each swinging chain mooring was found to result in the loss of 122 m² of seagrass. Loss is restricted to the area surrounding the mooring, and the impact does not appear to translate to a meadow scale. This loss of United Kingdom seagrass from boat moorings is small but significant at a local scale, because it fragments existing meadows and ultimately reduces their resilience to other stressors. Boat moorings are prevalent in seagrass globally, and it is likely that this impairs their ecosystem functioning. Given the extensive ecosystem service value of seagrasses in terms of factors such as carbon storage and fish habitat, such loss is cause for concern.
This indicates the need for the widespread use of seagrass-friendly mooring systems in and around seagrass.
