Genomics

Dataset Information


A Cancer Cell-Line Titration Series for Evaluating Somatic Classification


ABSTRACT: Accurate detection of somatic single nucleotide variants and small insertions and deletions from DNA sequencing experiments of tumour-normal pairs is a challenging task. Tumour samples are often contaminated with normal cells, which confounds the available evidence for somatic variants. Furthermore, tumours are heterogeneous, so sub-clonal variants are observed at reduced allele frequencies. We present here a cell-line titration series dataset that can be used to evaluate somatic variant calling pipelines, with the goal of reliably calling true somatic mutations at low allele frequencies. Cell-line DNA was mixed with matched normal DNA at 8 different ratios to generate samples with known tumour cellularities, and exome sequenced on Illumina HiSeq to depths of >300x. The data were processed with several different variant calling pipelines, and verification experiments were performed to assay >1,500 somatic variant candidates using Ion Torrent PGM as an orthogonal technology. By examining the variants called at varying cellularities and depths of coverage, we show that the best performing pipelines are able to maintain a high level of precision at any cellularity. In addition, we estimate the number of true somatic variants that go undetected as cellularity and coverage decrease. Our cell-line titration series dataset, along with the associated verification results, is effective for this evaluation and will serve as a valuable dataset for future somatic calling algorithm development.
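To illustrate why low cellularity makes somatic variants harder to detect, the following is a minimal sketch, not the authors' method. It assumes a simple mixture model in which a clonal variant is present on a fixed number of copies in every tumour cell and absent from normal cells. The function names, the default copy-number values, and the example cellularity and depth are all hypothetical; the dataset's actual 8 mixing ratios are not listed here.

```python
def expected_vaf(cellularity: float,
                 tumour_alt_copies: int = 1,
                 tumour_ploidy: int = 2,
                 normal_ploidy: int = 2) -> float:
    """Expected variant allele fraction of a clonal somatic variant in a
    tumour/normal DNA mixture (assumed model: variant absent in normal cells)."""
    if not 0.0 <= cellularity <= 1.0:
        raise ValueError("cellularity must be between 0 and 1")
    # Variant-carrying copies contributed by the tumour fraction.
    alt = cellularity * tumour_alt_copies
    # All copies of the locus, from tumour and normal fractions combined.
    total = cellularity * tumour_ploidy + (1.0 - cellularity) * normal_ploidy
    return alt / total


def expected_alt_reads(cellularity: float, depth: float) -> float:
    """Expected number of variant-supporting reads at a given coverage depth,
    for a heterozygous variant in a diploid region (the defaults above)."""
    return expected_vaf(cellularity) * depth


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    print(expected_vaf(0.1))
    print(expected_alt_reads(0.1, 300))
```

Under these assumptions, a heterozygous variant has an expected allele fraction of half the cellularity, so the number of supporting reads falls in proportion to both cellularity and depth. Real cell lines often have copy-number changes, which is why the ploidy and copy-number arguments are left adjustable.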

PROVIDER: EGAS00001001016 | EGA

REPOSITORIES: EGA


Publications

A cancer cell-line titration series for evaluating somatic classification.

Robert E. Denroche, Laura Mullen, Lee Timms, Timothy Beck, Christina K. Yung, Lincoln Stein, John D. McPherson, Andrew M. K. Brown

BMC Research Notes, 2015-12-26



Similar Datasets

| PRJNA209118 | ENA
2024-08-05 | GSE213503 | GEO
2024-08-05 | GSE213338 | GEO
| EGAD00001000625 | EGA
| EGAS00001000433 | EGA
| PRJNA480852 | ENA
| PRJNA814051 | ENA
2016-07-22 | PXD002613 | Pride
2017-12-01 | GSE79989 | GEO
| PRJNA509008 | ENA