Dataset Information

An alignment algorithm for bisulfite sequencing using the Applied Biosystems SOLiD System.


ABSTRACT:

Summary

Bisulfite sequencing allows cytosine methylation, an important epigenetic marker, to be detected via nucleotide substitutions. Since the Applied Biosystems SOLiD System uses a unique di-base encoding that increases confidence in the detection of nucleotide substitutions, it is a potentially advantageous platform for this application. However, the di-base encoding also makes reads with many nucleotide substitutions difficult to align to a reference sequence with existing tools, preventing the platform's potential utility for bisulfite sequencing from being realized. Here, we present SOCS-B, a reference-based, un-gapped alignment algorithm for the SOLiD System that is tolerant of both bisulfite-induced nucleotide substitutions and a parametric number of sequencing errors, facilitating bisulfite sequencing on this platform. An implementation of the algorithm has been integrated with the previously reported SOCS alignment tool, and was used to align CpG methylation-enriched Arabidopsis thaliana bisulfite sequence data, exhibiting a 2-fold increase in sensitivity compared to existing methods for aligning SOLiD bisulfite data.
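To illustrate why di-base encoding complicates alignment of bisulfite-converted reads, the following is a minimal sketch and not the SOCS-B algorithm itself. It assumes the standard SOLiD color scheme, in which each color is the XOR of the 2-bit codes of two adjacent bases. The function names (`to_colors`, `bisulfite_convert`, `color_mismatches`) and the example sequence are hypothetical and chosen only for illustration.

```python
# Assumed 2-bit base codes: the SOLiD di-base color of two adjacent
# bases is the XOR of their codes (an illustration, not SOCS-B code).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def to_colors(seq):
    """Encode a nucleotide string as a list of di-base colors (0-3)."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def bisulfite_convert(seq, methylated=frozenset()):
    """Replace each C with T unless its index is listed as methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def color_mismatches(seq_a, seq_b):
    """Count positions where the color encodings of two sequences differ."""
    return sum(x != y for x, y in zip(to_colors(seq_a), to_colors(seq_b)))


# Hypothetical reference with one methylated cytosine at index 1.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated={1})

print(read)                               # converted read in base space
print(to_colors(reference))               # reference in color space
print(to_colors(read))                    # read in color space
print(color_mismatches(reference, read))  # color differences from conversion
```

An isolated base substitution changes the two colors that flank it, so a read carrying many C-to-T conversions accumulates color mismatches that a color-space aligner with a small, fixed mismatch budget would treat as sequencing errors.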

Availability

Executables, source code, and sample data are available at http://solidsoftwaretools.com/gf/project/socs/

SUBMITTER: Ondov BD 

PROVIDER: S-EPMC2905549 | biostudies-literature | 2010 Aug

REPOSITORIES: biostudies-literature

Publications

An alignment algorithm for bisulfite sequencing using the Applied Biosystems SOLiD System.

Ondov BD, Cochran C, Landers M, Meredith GD, Dudas M, Bergman NH

Bioinformatics (Oxford, England), 2010-06-18, issue 15


Similar Datasets

2010-10-20 | GSE24012 | GEO
2010-10-19 | E-GEOD-24012 | biostudies-arrayexpress
| S-EPMC1764432 | biostudies-literature
| S-EPMC4481698 | biostudies-literature
| S-EPMC2639273 | biostudies-other
| S-EPMC4769831 | biostudies-literature
| S-EPMC3268241 | biostudies-literature
| S-EPMC446300 | biostudies-literature
| S-EPMC10477233 | biostudies-literature
| S-EPMC11237527 | biostudies-literature