Dataset Information

Sigma: strain-level inference of genomes from metagenomic analysis for biosurveillance.


ABSTRACT:

Motivation

Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis.

Results

Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. The algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains.
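The abstract does not spell out the underlying statistical model, so the following is only a minimal sketch of one common way to obtain relative abundances and read assignments: a mixture model fitted by expectation-maximization (EM). It is not Sigma's implementation. The function names `estimate_abundances` and `assign_reads`, and the input layout `likelihoods[r][g]` (an assumed probability of read `r` given reference genome `g`, for instance derived from alignment scores), are hypothetical and chosen for illustration.

```python
def estimate_abundances(likelihoods, n_iter=200, tol=1e-9):
    """Estimate relative genome abundances with a generic EM mixture model.

    likelihoods[r][g] is an assumed P(read r | genome g); this input layout
    is an illustration, not Sigma's actual data format.
    """
    n_genomes = len(likelihoods[0])
    abund = [1.0 / n_genomes] * n_genomes  # start from uniform abundances
    for _ in range(n_iter):
        totals = [0.0] * n_genomes
        for row in likelihoods:
            # E-step: posterior responsibility of each genome for this read
            weighted = [a * p for a, p in zip(abund, row)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read is compatible with no genome; ignore it
            for g, w in enumerate(weighted):
                totals[g] += w / norm
        total = sum(totals)
        if total == 0.0:
            raise ValueError("no read is compatible with any genome")
        # M-step: abundances are the normalized expected read counts
        new_abund = [t / total for t in totals]
        delta = max(abs(a - b) for a, b in zip(abund, new_abund))
        abund = new_abund
        if delta < tol:
            break
    return abund


def assign_reads(likelihoods, abund):
    """Assign each read to the genome with the highest posterior weight.

    Returns one genome index per read, or None when the read is
    compatible with no genome. Ties go to the lowest genome index.
    """
    assignments = []
    for row in likelihoods:
        post = [a * p for a, p in zip(abund, row)]
        if sum(post) == 0.0:
            assignments.append(None)
        else:
            assignments.append(max(range(len(post)), key=post.__getitem__))
    return assignments


if __name__ == "__main__":
    # Toy input: three reads match only genome 0, one read only genome 1.
    toy = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    abundances = estimate_abundances(toy)
    print(abundances)
    print(assign_reads(toy, abundances))
```

In a sketch like this, reads grouped by their assigned genome could then be passed to a variant caller, and uncertainty in the abundances could be approximated by resampling reads; how Sigma itself performs hypothesis testing and confidence interval estimation is described in the paper, not here.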

Availability and implementation

Sigma was implemented in C++, with source code and binaries freely available at http://sigma.omicsbio.org.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Ahn TH 

PROVIDER: S-EPMC4287953 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5533426 | biostudies-other
| S-EPMC9580935 | biostudies-literature
| S-EPMC6169887 | biostudies-literature
| S-EPMC5541208 | biostudies-literature
| S-EPMC5314789 | biostudies-literature
| S-EPMC8388557 | biostudies-literature
| PRJEB20873 | ENA
| S-EPMC6624308 | biostudies-literature
| S-EPMC10516513 | biostudies-literature
| S-EPMC7284976 | biostudies-literature