Dataset Information

V-pipe: a computational pipeline for assessing viral genetic diversity from high-throughput data.

ABSTRACT:

Motivation

High-throughput sequencing technologies are increasingly used, not only in viral genomics research but also in clinical surveillance and diagnostics. These technologies facilitate the assessment of the genetic diversity in intra-host virus populations, which affects transmission, virulence, and pathogenesis of viral infections. However, there are two major challenges in analysing viral diversity. First, amplification and sequencing errors confound the identification of true biological variants, and second, the large data volumes pose computational challenges.

Results

To support viral high-throughput sequencing studies, we developed V-pipe, a bioinformatics pipeline combining various state-of-the-art statistical models and computational tools for automated end-to-end analyses of raw sequencing reads. V-pipe supports quality control, read mapping and alignment, low-frequency mutation calling, and inference of viral haplotypes. For generating high-quality read alignments, we developed a novel method, called ngshmmalign, based on profile hidden Markov models and tailored to small and highly diverse viral genomes. V-pipe also includes benchmarking functionality providing a standardized environment for comparative evaluations of different pipeline configurations. We demonstrate this capability by assessing the impact of three different read aligners (Bowtie 2, BWA MEM, ngshmmalign) and two different variant callers (LoFreq, ShoRAH) on the performance of calling single-nucleotide variants in intra-host virus populations. V-pipe supports various pipeline configurations and is implemented in a modular fashion to facilitate adaptations to the continuously changing technology landscape.
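The benchmarking comparison described above can be pictured as a small grid of pipeline configurations. The following is a minimal sketch, assuming the comparison is a full cross of the three aligners and two variant callers (the abstract does not state this explicitly). The function name and dictionary keys are hypothetical and are not V-pipe configuration options.

```python
from itertools import product

# Tool names are taken from the abstract. The dictionary keys below are
# illustrative only and do not correspond to real V-pipe settings.
ALIGNERS = ["Bowtie 2", "BWA MEM", "ngshmmalign"]
VARIANT_CALLERS = ["LoFreq", "ShoRAH"]


def benchmark_configurations():
    """Return every aligner/variant-caller pairing as a list of dicts.

    Assumes a full cross of both lists (3 x 2 = 6 configurations).
    """
    return [
        {"aligner": aligner, "variant_caller": caller}
        for aligner, caller in product(ALIGNERS, VARIANT_CALLERS)
    ]


if __name__ == "__main__":
    for config in benchmark_configurations():
        print(f"{config['aligner']} + {config['variant_caller']}")
```

Under that assumption, each of the six pairings would be run on the same input reads so that differences in single-nucleotide variant calls can be attributed to the choice of aligner or caller.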

Availability

V-pipe is freely available at https://github.com/cbg-ethz/V-pipe.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Posada-Cespedes S

PROVIDER: S-EPMC8289377 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

eccDNA-pipe: an integrated pipeline for identification, analysis and visualization of extrachromosomal circular DNA from high-throughput sequencing data.
| S-EPMC10862650 | biostudies-literature

R-SAP: a multi-threading computational pipeline for the characterization of high-throughput RNA-sequencing data.
| S-EPMC3351179 | biostudies-literature

HaTSPiL: A modular pipeline for high-throughput sequencing data analysis.
| S-EPMC6793853 | biostudies-literature

iMir: An integrated pipeline for high-throughput miRNA-Seq data analysis
| S-ECPF-GEOD-40617 | biostudies-other

hipFG: high-throughput harmonization and integration pipeline for functional genomics data.
| S-EPMC10660288 | biostudies-literature

PhyloHerb: A high-throughput phylogenomic pipeline for processing genome skimming data.
| S-EPMC9215275 | biostudies-literature

GENE-counter, a computational and statistical pipeline for assessing RNA-Seq data for genome-wide expression differences
2010-12-03 | GSE25818 | GEO

A computational pipeline for high-throughput discovery of cis-regulatory noncoding RNA in prokaryotes.
| S-EPMC1913097 | biostudies-literature

HiChIP: a high-throughput pipeline for integrative analysis of ChIP-Seq data.
| S-EPMC4152589 | biostudies-literature

BIPES, a cost-effective high-throughput method for assessing microbial diversity.
| S-EPMC3105743 | biostudies-literature