Dataset Information

Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data.

ABSTRACT:

Background

The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour-specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have recently been developed, but their performance has yet to be thoroughly assessed. To this end, a comprehensive model that integrates normal cell contamination and intra-tumour heterogeneity, and that can be translated into synthetic data on which to perform benchmarks, is indispensable.
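
To make the effect of normal cell contamination concrete, the following is a minimal sketch of a commonly used linear mixing model for the two SNP-array signals, log R ratio (LRR) and B allele frequency (BAF). It is not taken from CnaGen or from any of the assessed tools: the function name `expected_signals`, its parameters, and the simplifications (diploid normal cells, a germline-heterozygous SNP, a single tumour clone, no array noise or signal compression) are all assumptions made for illustration.

```python
import math


def expected_signals(tumour_cn, tumour_b, contamination):
    """Idealised LRR and BAF at a germline-heterozygous SNP.

    Hypothetical helper, not part of any published package.
    tumour_cn      total copy number in tumour cells
    tumour_b       number of B-allele copies in tumour cells
    contamination  fraction of normal (diploid, AB) cells in the sample
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be in [0, 1]")
    if not 0 <= tumour_b <= tumour_cn:
        raise ValueError("tumour_b must be between 0 and tumour_cn")

    tumour_fraction = 1.0 - contamination
    # Average number of copies (total and B allele) per cell in the mixture.
    total_copies = tumour_fraction * tumour_cn + contamination * 2
    b_copies = tumour_fraction * tumour_b + contamination * 1
    if total_copies == 0:
        raise ValueError("no DNA at this locus: signals are undefined")

    lrr = math.log2(total_copies / 2.0)  # relative to a diploid reference
    baf = b_copies / total_copies
    return lrr, baf


# Hemizygous deletion of the B allele in a sample with 50% normal cells.
lrr, baf = expected_signals(tumour_cn=1, tumour_b=0, contamination=0.5)
print(f"LRR={lrr:.3f}, BAF={baf:.3f}")
```

Under these assumptions, increasing contamination pulls both signals towards the values of a normal diploid sample (LRR 0, BAF 0.5), which is one reason alterations become harder to call in heavily contaminated samples.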

Results

We propose such a model and implement it in an R package called CnaGen to synthetically generate a wide range of alterations under different normal cell contamination levels. Six recently published methods for CNA and loss of heterozygosity (LOH) detection on tumour samples were assessed on this synthetic data and on a dilution series of a breast cancer cell line: ASCAT, GAP, GenoCNA, GPHMM, MixHMM and OncoSNP. We report the recall rates in terms of normal cell contamination levels and alteration characteristics (length, copy number and LOH state), as well as the false discovery rate distribution for each copy number under different normal cell contamination levels. The assessed methods are in general better at detecting alterations with low copy number and under low levels of normal cell contamination. All methods except GPHMM, which failed to recognize the alteration pattern in the cell-line samples, provided similar results for the synthetic and cell-line sample sets. MixHMM and GenoCNA are the worst-performing methods, while GAP generally performed better. This supports the viability of approaches other than those based on the common hidden Markov model (HMM).
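
The two evaluation measures named above, recall and false discovery rate (FDR) per copy number, can be sketched as follows. The abstract does not state the exact definitions used in the study, so this sketch assumes a simple per-probe comparison between a simulated truth and a method's calls. The function names and the toy data are hypothetical.

```python
def recall(truth, calls, normal_cn=2):
    """Fraction of truly altered probes whose copy number was called correctly.

    Assumed definition for illustration; returns None if no probe is altered.
    """
    altered = [(t, c) for t, c in zip(truth, calls) if t != normal_cn]
    if not altered:
        return None
    return sum(t == c for t, c in altered) / len(altered)


def fdr_for_copy_number(truth, calls, cn):
    """Fraction of probes called as copy number `cn` whose true value differs.

    Assumed definition for illustration; returns None if `cn` is never called.
    """
    truth_where_called = [t for t, c in zip(truth, calls) if c == cn]
    if not truth_where_called:
        return None
    return sum(t != cn for t in truth_where_called) / len(truth_where_called)


# Toy per-probe copy numbers: simulated truth versus one method's calls.
truth = [2, 2, 1, 1, 3, 3]
calls = [2, 1, 1, 1, 3, 2]

print("recall:", recall(truth, calls))
print("FDR for copy number 1:", fdr_for_copy_number(truth, calls, 1))
```

Computing these values separately for each contamination level, alteration length and copy number would give a breakdown of the kind reported in the abstract, although the study itself may count whole alterations rather than individual probes.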

Conclusions

We devised and implemented a comprehensive model to generate data that simulate tumoural samples genotyped using SNP arrays. The validity of the model is supported by the similarity of the results obtained with synthetic and real data. Based on these results and on the software implementation of the methods, we recommend GAP for advanced users and GPHMM for a fully driven analysis.

SUBMITTER: Mosen-Ansorena D

PROVIDER: S-EPMC3472297 | biostudies-literature | 2012 Aug

REPOSITORIES: biostudies-literature

Publications

Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data.

Mosén-Ansorena D, Aransay AM, Rodríguez-Ezpeleta N

BMC Bioinformatics, 2012-08-07

PMID: 22870940

Similar Datasets

QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data.
| S-EPMC1874617 | biostudies-literature

WISExome: a within-sample comparison approach to detect copy number variations in whole exome sequencing data.
| S-EPMC5865163 | biostudies-literature

Copy number variation detection and genotyping from exome sequence data.
| S-EPMC3409265 | biostudies-literature

Segregation distortion: Utilizing simulated genotyping data to evaluate statistical methods.
| S-EPMC7029859 | biostudies-literature

Estimation of copy number alterations from exome sequencing data.
| S-EPMC3526607 | biostudies-literature

Bayesian model to detect phenotype-specific genes for copy number data.
| S-EPMC3576305 | biostudies-literature

Copy number variation genotyping using family information.
| S-EPMC3668900 | biostudies-other

Comparative analysis of methods for identifying recurrent copy number alterations in cancer.
| S-EPMC3527554 | biostudies-literature

BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations.
| S-EPMC4003822 | biostudies-literature

CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data.
| S-EPMC9873172 | biostudies-literature