Dataset Information

Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness.


ABSTRACT:

Background

Estimating the number of different species (richness) in a mixed microbial population has been a main focus of metagenomic research. Existing methods of species richness estimation rely on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. Neither this assumption nor the underlying probabilistic formulations of existing methods are suitable for quasispecies populations, where the strains are highly genetically related.

Results

On benchmark data sets, our estimation method provided accurate richness estimates (< 0.2 median estimation error) and improved the precision of ViQuaS by 2%-13% and F-score by 1%-9% without compromising the recall rates. We also demonstrate that our estimation method can be used to improve the precision and F-score of ShoRAH by 0%-7% and 0%-5% respectively.

Conclusions

The proposed probabilistic estimation method can be used to estimate the richness of viral populations with a quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors.

Availability

http://sourceforge.net/projects/viquas/.

SUBMITTER: Jayasundara D 

PROVIDER: S-EPMC4682401 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness.

Jayasundara, Duleepa; Saeed, I; Chang, B C; Tang, Sen-Lin; Halgamuge, Saman K

BMC Bioinformatics, 2015-12-09


Similar Datasets

| S-EPMC3194189 | biostudies-literature
| S-EPMC6355096 | biostudies-literature
| S-EPMC6022648 | biostudies-literature
| S-EPMC3372249 | biostudies-other
| S-EPMC6797082 | biostudies-literature
2021-04-23 | GSE145571 | GEO
| S-EPMC3869190 | biostudies-literature
| S-EPMC5118932 | biostudies-literature
| S-EPMC1569948 | biostudies-literature
| S-EPMC8604935 | biostudies-literature