Dataset Information

High-resolution strain-level microbiome composition analysis from short reads.


ABSTRACT:

Background

Bacterial strains of the same species can exhibit different biological properties, making strain-level composition analysis an important step in understanding the dynamics of microbial communities. Metagenomic sequencing has become the major means of probing microbial composition in host-associated and environmental samples. Although a plethora of composition analysis tools exist, they are not optimized for the two main challenges of strain-level analysis: highly similar strain genomes and the presence of multiple strains of one species in a sample. This work therefore aims to provide a higher-resolution, more accurate strain-level analysis tool for short reads.
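To make the "highly similar strain genomes" challenge concrete, the following minimal sketch compares the k-mer sets of two toy sequences that differ by a single base. The sequences, the value of k, and the function names are hypothetical and chosen only for illustration; this is not the method of any tool named in this record.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy "strains": strain_b is strain_a with one substituted base
# (position 20, C -> T).
strain_a = "ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCCGATTACG"
strain_b = strain_a[:20] + "T" + strain_a[21:]

k = 5  # illustrative value only; real tools use much longer k-mers
ka, kb = kmers(strain_a, k), kmers(strain_b, k)

similarity = jaccard(ka, kb)
only_a = ka - kb  # k-mers that could distinguish strain_a from strain_b
only_b = kb - ka  # k-mers that could distinguish strain_b from strain_a

print(f"Jaccard similarity: {similarity:.2f}")
print(f"k-mers unique to strain_a: {sorted(only_a)}")
print(f"k-mers unique to strain_b: {sorted(only_b)}")
```

Most k-mers are shared between the two sequences, and only the few that overlap the substituted position tell them apart. A strain-level method has to rely on such a small discriminating signal, which is why near-identical genomes are hard to resolve.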

Results

In this work, we present StrainScan, a new strain-level composition analysis tool that employs a novel tree-based k-mer indexing structure to balance strain identification accuracy against computational complexity. We tested StrainScan extensively on a large number of simulated and real sequencing datasets and benchmarked it against popular strain-level analysis tools, including Krakenuniq, StrainSeeker, Pathoscope2, Sigma, StrainGE, and StrainEst. The results show that StrainScan achieves higher accuracy and resolution than these state-of-the-art tools in strain-level composition analysis. It improves the F1 score by 20% when identifying multiple strains at the strain level.
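For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a set of predicted strains against a known set of true strains. The abstract does not state exactly how the F1 score was computed in the benchmark, so this set-based definition, the function name, and the strain labels are assumptions for illustration only.

```python
def f1_score(predicted: set[str], truth: set[str]) -> float:
    """Set-based F1: harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives.
    """
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # fraction of calls that are correct
    recall = true_positives / len(truth)         # fraction of true strains recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample containing three strains of one species.
truth = {"strain_1", "strain_2", "strain_3"}
# Hypothetical tool output: two correct calls and one false positive.
predicted = {"strain_1", "strain_2", "strain_9"}

print(f"F1 = {f1_score(predicted, truth):.3f}")
```

Because F1 penalizes both missed strains and spurious calls, it suits samples that contain several strains of one species, where a tool can err in either direction.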

Conclusions

By using a novel k-mer indexing structure, StrainScan provides strain-level analysis with higher resolution than existing tools, enabling more informative strain composition analysis within one sample or across multiple samples. StrainScan takes short reads and a set of reference strains as input, and its source code is freely available at https://github.com/liaoherui/StrainScan.

SUBMITTER: Liao H 

PROVIDER: S-EPMC10433603 | biostudies-literature | 2023 Aug

REPOSITORIES: biostudies-literature

Publications

High-resolution strain-level microbiome composition analysis from short reads.

Liao Herui, Ji Yongxin, Sun Yanni

Microbiome, 2023-08-17, issue 1


Similar Datasets

| S-EPMC6169887 | biostudies-literature
| S-EPMC3905898 | biostudies-literature
| S-EPMC9508831 | biostudies-literature
| S-EPMC8131609 | biostudies-literature
| S-EPMC7850483 | biostudies-literature
| S-EPMC6956785 | biostudies-literature
| S-EPMC7168855 | biostudies-literature
| S-EPMC3689806 | biostudies-literature
| S-EPMC6624308 | biostudies-literature
| S-EPMC6153666 | biostudies-literature