Dataset Information

Tangent normalization for somatic copy-number inference in cancer genome analysis.


ABSTRACT:

Motivation

Somatic copy-number alterations (SCNAs) play an important role in cancer development. Systematic noise in sequencing and array data presents a significant challenge to the inference of SCNAs for cancer genome analyses. As part of The Cancer Genome Atlas, the Broad Institute Genome Characterization Center developed the Tangent normalization method to generate copy-number profiles using data from single-nucleotide polymorphism (SNP) arrays and whole-exome sequencing (WES) technologies for over 10 000 pairs of tumors and matched normal samples. Here, we describe the Tangent method, which uses a unique linear combination of normal samples as a reference for each tumor sample to subtract systematic errors that vary across samples. We also describe a modification of Tangent, called Pseudo-Tangent, which enables denoising through comparisons between tumor profiles when few normal samples are available.
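The idea of a per-tumor linear combination of normal samples can be illustrated with a minimal sketch. It assumes the combination weights are obtained by ordinary least squares, which the abstract does not state, and it runs on simulated log2 coverage profiles. The function name `tangent_like_normalize` and all data below are hypothetical and are not taken from the Tangent code base.

```python
import numpy as np

def tangent_like_normalize(tumor, normals):
    """Subtract from `tumor` its best-fitting linear combination of normals.

    tumor:   (n_markers,) log2 coverage profile of one tumor sample
    normals: (n_markers, n_normals) log2 coverage profiles of normal samples
    Assumption: weights are chosen by least squares (illustrative only).
    """
    weights, *_ = np.linalg.lstsq(normals, tumor, rcond=None)
    reference = normals @ weights      # tumor-specific reference profile
    return tumor - reference, weights  # residual = denoised profile

# Simulated data: two systematic noise patterns shared by all samples.
rng = np.random.default_rng(0)
n_markers, n_normals = 500, 8
noise_basis = rng.normal(size=(n_markers, 2))
normals = (noise_basis @ rng.normal(size=(2, n_normals))
           + 0.01 * rng.normal(size=(n_markers, n_normals)))

# Simulated tumor: one copy-number gain plus the same systematic noise.
signal = np.zeros(n_markers)
signal[200:300] = 1.0
tumor = (signal + noise_basis @ np.array([0.8, -0.5])
         + 0.01 * rng.normal(size=n_markers))

denoised, weights = tangent_like_normalize(tumor, normals)

outside = np.r_[0:200, 300:n_markers]  # markers with no simulated alteration
print("noise before:", tumor[outside].std())
print("noise after: ", denoised[outside].std())
print("mean inside gain:", denoised[200:300].mean())
```

In this toy setting the reference absorbs the shared noise patterns, while the simulated gain is largely preserved because it is nearly orthogonal to the normal profiles.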

Results

Tangent normalization substantially increases signal-to-noise ratios (SNRs) compared to conventional normalization methods in both SNP array and WES analyses. Tangent and Pseudo-Tangent normalizations improve the SNR by reducing noise with minimal effect on signal, and their contribution exceeds that of other steps in the analysis, such as the choice of segmentation algorithm. Tangent and Pseudo-Tangent are broadly applicable and enable more accurate inference of SCNAs from DNA sequencing and array data.
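The abstract does not define how SNR is computed. As a rough illustration only, the sketch below assumes signal is the spread of segment means and noise is the spread of the residuals around those means. The function `snr` and the simulated profiles are hypothetical and may differ from the definition used in the paper.

```python
import numpy as np

def snr(profile, segment_labels):
    """Illustrative SNR: std of segment means divided by std of residuals."""
    profile = np.asarray(profile, dtype=float)
    labels = np.asarray(segment_labels)
    seg_means = np.zeros_like(profile)
    for lab in np.unique(labels):
        mask = labels == lab
        seg_means[mask] = profile[mask].mean()  # per-marker segment mean
    signal = seg_means.std()
    noise = (profile - seg_means).std()
    return signal / noise

# Simulated profile: one gained segment flanked by two neutral segments.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], [200, 100, 200])
truth = np.where(labels == 1, 1.0, 0.0)

noisy = truth + rng.normal(scale=0.5, size=truth.size)    # before denoising
cleaner = truth + rng.normal(scale=0.1, size=truth.size)  # after denoising

snr_noisy = snr(noisy, labels)
snr_cleaner = snr(cleaner, labels)
print("SNR, noisy profile:  ", snr_noisy)
print("SNR, cleaner profile:", snr_cleaner)
```

Under this definition, reducing the noise while leaving the segment means unchanged raises the SNR, which matches the kind of improvement the abstract attributes to Tangent.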

Availability and implementation

Tangent is available at https://github.com/broadinstitute/tangent and as a Docker image (https://hub.docker.com/r/broadinstitute/tangent). Tangent is also the normalization method for the copy-number pipeline in Genome Analysis Toolkit 4 (GATK4).

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Gao GF 

PROVIDER: S-EPMC9563697 | biostudies-literature | 2022 Oct

REPOSITORIES: biostudies-literature


Similar Datasets

| S-EPMC6662281 | biostudies-literature
| S-ECPF-GEOD-23056 | biostudies-other
| S-EPMC4563570 | biostudies-literature
| S-ECPF-GEOD-37382 | biostudies-other
| S-EPMC3481445 | biostudies-literature
| S-EPMC3966983 | biostudies-literature
| S-ECPF-GEOD-22932 | biostudies-other
| S-EPMC6178888 | biostudies-literature
| S-EPMC3827077 | biostudies-literature
| S-EPMC8355892 | biostudies-literature