
Dataset Information


Conpair: concordance and contamination estimator for matched tumor-normal pairs.


ABSTRACT:

Motivation

Sequencing of matched tumor and normal samples is the standard study design for reliable detection of somatic alterations. However, even very low levels of cross-sample contamination significantly impact calling of somatic mutations, because contaminant germline variants can be incorrectly interpreted as somatic. There are currently no sequence-only based methods that reliably estimate contamination levels in tumor samples, which frequently display copy number changes. As a solution, we developed Conpair, a tool for detection of sample swaps and cross-individual contamination in whole-genome and whole-exome tumor-normal sequencing experiments.

Results

On a ladder of in silico contaminated samples, we demonstrated that Conpair reliably measures contamination levels as low as 0.1%, even in the presence of copy number changes. We also estimated contamination levels in glioblastoma WGS and WXS tumor-normal datasets from TCGA and showed that they strongly correlate with tumor-normal concordance, as well as with the number of germline variants called as somatic by several widely used somatic callers.
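To make the scale of a 0.1% contamination signal concrete, the following is a minimal sketch of the expected alternate-allele read fraction at a site where the sample itself is homozygous reference and the contaminating individual carries the alternate allele. It is not Conpair's algorithm; the function name `expected_alt_fraction` and its parameters are hypothetical, and it assumes both genomes are diploid at the site, an assumption that copy number changes in tumors would violate.

```python
def expected_alt_fraction(contamination, contaminant_alt_copies):
    """Expected fraction of reads carrying the alternate allele at a site
    where the sample itself is homozygous reference.

    Illustrative assumptions only: both genomes are diploid at the site and
    reads are sampled in proportion to each individual's DNA content.
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be between 0 and 1")
    if contaminant_alt_copies not in (0, 1, 2):
        raise ValueError("contaminant_alt_copies must be 0, 1, or 2")
    # Each alternate copy in the contaminant contributes half of its DNA share.
    return contamination * contaminant_alt_copies / 2.0


# Hypothetical ladder of contamination levels, down to the 0.1% in the abstract.
for level in (0.001, 0.01, 0.05):
    het = expected_alt_fraction(level, 1)
    hom = expected_alt_fraction(level, 2)
    print(f"contamination={level:.3f}  het contaminant={het:.4f}  hom contaminant={hom:.4f}")
```

Under these assumptions, 0.1% contamination from a homozygous-alternate contaminant corresponds to roughly one alternate read in a thousand, so a single site at typical sequencing depth carries little signal on its own.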

Availability and implementation

The method is available at: https://github.com/nygenome/conpair

Contact: egrabowska@gmail.com or mczody@nygenome.org

Supplementary information: Supplementary data are available at Bioinformatics online.

SUBMITTER: Bergmann EA 

PROVIDER: S-EPMC5048070 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature


Publications

Conpair: concordance and contamination estimator for matched tumor-normal pairs.

Bergmann Ewa A, Chen Bo-Juen, Arora Kanika, Vacic Vladimir, Zody Michael C

Bioinformatics (Oxford, England) 2016-06-26 20



Similar Datasets

| S-EPMC3060821 | biostudies-literature
| S-DIXA-D-1022 | biostudies-other
| S-EPMC6528031 | biostudies-literature
2009-01-23 | GSE14353 | GEO
2009-01-22 | E-GEOD-14353 | biostudies-arrayexpress
2022-12-16 | GSE142279 | GEO
| S-EPMC6440274 | biostudies-literature
2017-08-22 | GSE102101 | GEO
| S-EPMC5584341 | biostudies-literature
2019-02-20 | GSE110907 | GEO