VirGenA: a reference-based assembler for variable viral genomes.
ABSTRACT: Characterization of the within-host genetic diversity of viral pathogens is required to select effective treatment for some important viral infections, e.g. HIV, HBV and HCV. Although low-frequency variants can be detected technically, data on their clinical significance are conflicting, partly because such variants are difficult to distinguish from experimental artifacts. Cross-contamination is a concern for all highly sensitive techniques, including deep sequencing: even trace contamination leads to a significant increase in false positives among identified SNVs. Detecting infections by multiple genotypes of some viruses, whose incidence can be considerable, especially in risk groups, is also clinically significant in some cases. We developed a new viral reference-guided assembler, VirGenA, that can separate mixtures of strains of different intraspecies genetic groups (genotypes, subtypes, clades, etc.) and assemble a separate consensus sequence for each group in a mixture. It produced long assemblies for mixture components of extremely low frequencies (<1%), allowing detection of cross-contamination of samples by divergent genotypes. We tested VirGenA on both clinical and simulated data. On both types of data, VirGenA shows results better than or similar to those of existing de novo assemblers. A cross-platform implementation (including source code) is freely available at https://github.com/gFedonin/VirGenA/releases.
SUBMITTER: Fedonin GG
PROVIDER: S-EPMC6488938 | biostudies-literature | 2019 Jan
REPOSITORIES: biostudies-literature