Dataset Information

CallSim: Evaluation of Base Calls Using Sequencing Simulation.


ABSTRACT: Accurate base calls from sequencing data are required for downstream biological interpretation, particularly when rare variants are involved. CallSim is a software application that provides evidence for the validity of base calls believed to be sequencing errors; it is applicable to Ion Torrent and 454 data. The algorithm processes a single read using a Monte Carlo approach to sequencing simulation and does not depend on information from any other read in the data set. Three examples, drawn from general read correction and from error-or-variant classification, demonstrate its effectiveness as a robust base corrector for low-volume read processing. Specifically, correction of errors in Ion Torrent reads from a study of mutations in multidrug-resistant Staphylococcus aureus illustrates its ability to classify an erroneous homopolymer call. In addition, support for a rare variant in 454 data from a mixed viral population demonstrates its "base rescue" capability. CallSim is intended for hands-on downstream analysis. Such downstream efforts, although time-consuming, are necessary for accurate identification of rare variants.

SUBMITTER: Morrow JD 

PROVIDER: S-EPMC4393072 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

CallSim: Evaluation of Base Calls Using Sequencing Simulation.

Jarrett D. Morrow, Brandon W. Higgs

ISRN Bioinformatics, 2012-12-12



Similar Datasets

2012-05-18 | GSE29550 | GEO
2012-05-17 | E-GEOD-29550 | biostudies-arrayexpress
| S-EPMC3208482 | biostudies-literature
| PRJEB34900 | ENA
| PRJEB32906 | ENA
| S-EPMC6990610 | biostudies-literature
| S-EPMC3963795 | biostudies-other
| S-EPMC6453757 | biostudies-literature
| S-EPMC2825269 | biostudies-literature
| S-EPMC8015711 | biostudies-literature