Dataset Information

SNVMix: predicting single nucleotide variants from next-generation sequencing of tumors.

ABSTRACT: Next-generation sequencing (NGS) has enabled whole genome and transcriptome single nucleotide variant (SNV) discovery in cancer. NGS produces millions of short sequence reads that, once aligned to a reference genome sequence, can be interpreted for the presence of SNVs. Although tools exist for SNV discovery from NGS data, none are specifically suited to work with data from tumors, where altered ploidy and tumor cellularity impact the statistical expectations of SNV discovery. We developed three implementations of a probabilistic Binomial mixture model, called SNVMix, designed to infer SNVs from NGS data from tumors to address this problem. The first models allelic counts as observations and infers SNVs and model parameters using an expectation maximization (EM) algorithm, and is therefore capable of adjusting to the deviation of allelic frequencies inherent in genomically unstable tumor genomes. The second models nucleotide and mapping qualities of the reads by probabilistically weighting the contribution of a read/nucleotide to the inference of a SNV based on the confidence we have in the base call and the read alignment. The third combines filtering out low-quality data with probabilistic weighting of the qualities. We quantitatively evaluated these approaches on 16 ovarian cancer RNASeq datasets with matched genotyping arrays and a human breast cancer genome sequenced to >40x (haploid) coverage with ground truth data, and show systematically that the SNVMix models outperform competing approaches. Software and data are available at http://compbio.bccrc.ca. Contact: sshah@bccrc.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
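
The first model in the abstract (allelic counts fitted with EM) can be illustrated with a minimal sketch. This is not the authors' SNVMix code: it assumes three genotype components (aa, ab, bb), plain maximum-likelihood EM without priors, and hypothetical names (`fit_binomial_mixture`, `ref_counts`, `depths`) and starting values chosen only for illustration. The quality-weighted and filtering variants are not covered.

```python
import math


def log_binom_pmf(k, n, p):
    """Log probability of k successes under Binomial(n, p)."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1.0 - p)


def responsibilities(ref_counts, depths, weights, mus):
    """E-step: posterior probability of each genotype at each position."""
    post = []
    for k, n in zip(ref_counts, depths):
        logs = [math.log(w) + log_binom_pmf(k, n, mu)
                for w, mu in zip(weights, mus)]
        top = max(logs)  # subtract the maximum for numerical stability
        unnorm = [math.exp(v - top) for v in logs]
        total = sum(unnorm)
        post.append([u / total for u in unnorm])
    return post


def fit_binomial_mixture(ref_counts, depths, mus=(0.99, 0.5, 0.01), n_iter=50):
    """Fit mixing weights and per-genotype reference-allele frequencies.

    ref_counts[i] is the number of reads matching the reference at position i,
    depths[i] is the total read depth there. The starting values in `mus`
    (for aa, ab, bb) are illustrative assumptions, not values from the paper.
    """
    eps = 1e-6  # keeps parameters away from 0 and 1 so logs stay finite
    mus = list(mus)
    weights = [1.0 / len(mus)] * len(mus)
    for _ in range(n_iter):
        post = responsibilities(ref_counts, depths, weights, mus)
        # M-step: re-estimate each component from its weighted counts.
        for g in range(len(mus)):
            resp_sum = sum(row[g] for row in post)
            ref_sum = sum(row[g] * k for row, k in zip(post, ref_counts))
            depth_sum = sum(row[g] * n for row, n in zip(post, depths))
            weights[g] = max(resp_sum / len(post), eps)
            if depth_sum > 0:
                mus[g] = min(max(ref_sum / depth_sum, eps), 1.0 - eps)
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, mus, responsibilities(ref_counts, depths, weights, mus)


# Toy data: eight positions at depth 30 with varying reference read counts.
ref_counts = [30, 29, 14, 16, 0, 1, 30, 15]
depths = [30] * 8
weights, mus, post = fit_binomial_mixture(ref_counts, depths)

# Probability that a position carries a variant: P(ab) + P(bb).
snv_prob = [row[1] + row[2] for row in post]
```

Because the per-genotype frequencies are re-estimated from the data rather than fixed at 1, 0.5 and 0, a fit of this kind can follow the skewed allelic ratios that altered ploidy and mixed cellularity produce, which is the motivation the abstract gives for the EM-based model.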

SUBMITTER: Goya R

PROVIDER: S-EPMC2832826 | biostudies-other | 2010 Mar

REPOSITORIES: biostudies-other


Publications

SNVMix: predicting single nucleotide variants from next-generation sequencing of tumors.

Rodrigo Goya, Mark G F Sun, Ryan D Morin, Gillian Leung, Gavin Ha, Kimberley C Wiegand, Janine Senz, Anamaria Crisan, Marco A Marra, Martin Hirst, David Huntsman, Kevin P Murphy, Sam Aparicio, Sohrab P Shah

Bioinformatics (Oxford, England), 2010-02-03, issue 6


PMID: 20130035

Similar Datasets

SNVHMM: predicting single nucleotide variants from next generation sequencing.
| S-EPMC3718670 | biostudies-literature

An optimized targeted Next-Generation Sequencing approach for sensitive detection of single nucleotide variants.
| S-EPMC5766748 | biostudies-other

Simultaneous identification of clinically relevant single nucleotide variants, copy number alterations and gene fusions in solid tumors by targeted next-generation sequencing.
| S-EPMC5978263 | biostudies-literature

Comprehensive analysis to improve the validation rate for single nucleotide variants detected by next-generation sequencing.
| S-EPMC3906084 | biostudies-literature

From next-generation sequencing alignments to accurate comparison and validation of single-nucleotide variants: the pibase software.
| S-EPMC3592472 | biostudies-literature

Next-Generation Sequencing and In Vitro Expression Study of ADAMTS13 Single Nucleotide Variants in Deep Vein Thrombosis.
| S-EPMC5089687 | biostudies-literature

Next-generation sequencing for the diagnosis of MYH9-RD: Predicting pathogenic variants.
| S-EPMC6972977 | biostudies-literature

αIIbβ3 variants defined by next-generation sequencing: predicting variants likely to cause Glanzmann thrombasthenia.
| S-EPMC4403182 | biostudies-literature

Molecular profiling of metastatic colorectal tumors using next-generation sequencing: a single-institution experience.
| S-EPMC5522060 | biostudies-other

Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.
| S-EPMC4190874 | biostudies-other