
Dataset Information


Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants.


ABSTRACT: Whole-genome sequencing data allow detection of copy number variation (CNV) at high resolution. However, estimation based on read coverage along the genome suffers from bias due to GC content and other factors. Here, we develop an algorithm called BIC-seq2 that combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline CNVs accurately. Analysis of simulation data showed that this method outperforms existing methods. We apply this algorithm to low-coverage whole-genome sequencing data from peripheral blood of nearly a thousand patients across eleven cancer types in The Cancer Genome Atlas (TCGA) to identify cancer-predisposing CNV regions. We confirm known regions and discover new ones, including those covering KMT2C, GOLPH3, ERBB2 and PLAG1. Analysis of colorectal cancer genomes in particular reveals novel recurrent CNVs, including deletions at two chromatin-remodeling genes, RERE and NPM2. This method will be useful to many researchers interested in profiling CNVs from whole-genome sequencing data.
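
The abstract names Bayesian information criterion-based segmentation without giving details, so the following is only a minimal illustrative sketch of the general idea and not the BIC-seq2 implementation. It assumes already-normalized read counts per bin, a Poisson model for counts within a segment, and greedy merging of neighboring segments for as long as the BIC decreases. The function names (`poisson_loglik`, `bic`, `bic_segment`) and the penalty weight `lam` are hypothetical and chosen for illustration.

```python
import math


def poisson_loglik(total, n_bins):
    """Poisson log-likelihood of a segment at its maximum-likelihood rate.

    Terms that do not depend on the segmentation (log-factorials) are dropped.
    """
    if total == 0:
        return 0.0
    return total * math.log(total / n_bins) - total


def bic(segments, n, lam):
    """BIC of a segmentation: -2 * log-likelihood + lam * (#segments) * log(n)."""
    loglik = sum(poisson_loglik(total, end - start) for start, end, total in segments)
    return -2.0 * loglik + lam * len(segments) * math.log(n)


def bic_segment(counts, lam=1.0):
    """Greedily merge adjacent bins while doing so lowers the BIC.

    counts: read counts per bin (assumed already normalized, an assumption
            of this sketch).
    lam:    hypothetical tuning weight on the model-size penalty.
    Returns a list of half-open (start, end) bin index ranges.
    """
    n = len(counts)
    if n == 0:
        return []
    # Start with every bin as its own segment: (start, end, total reads).
    segments = [(i, i + 1, c) for i, c in enumerate(counts)]
    current = bic(segments, n, lam)
    while len(segments) > 1:
        best_i, best_score = None, current
        # Try merging each pair of neighbors and keep the best improvement.
        for i in range(len(segments) - 1):
            a, b = segments[i], segments[i + 1]
            merged = segments[:i] + [(a[0], b[1], a[2] + b[2])] + segments[i + 2:]
            score = bic(merged, n, lam)
            if score < best_score:
                best_i, best_score = i, score
        if best_i is None:
            break  # no merge lowers the BIC any further
        a, b = segments[best_i], segments[best_i + 1]
        segments[best_i:best_i + 2] = [(a[0], b[1], a[2] + b[2])]
        current = best_score
    return [(start, end) for start, end, _ in segments]


if __name__ == "__main__":
    # Toy profile: a run of roughly doubled coverage in the middle bins.
    print(bic_segment([20, 22, 19, 21, 40, 42, 39, 41, 20, 21]))
```

In this sketch a larger `lam` penalizes extra segments more strongly and therefore yields fewer, longer segments. Recomputing the full BIC for every candidate merge keeps the code short but is quadratic per pass, so a practical tool would update the score incrementally.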

SUBMITTER: Xi R 

PROVIDER: S-EPMC5772337 | biostudies-literature | 2016 Jul

REPOSITORIES: biostudies-literature


Publications

Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants.

Ruibin Xi, Semin Lee, Yuchao Xia, Tae-Min Kim, Peter J. Park

Nucleic Acids Research, 2016-06-03, issue 13



Similar Datasets

| S-EPMC5175347 | biostudies-literature
| S-EPMC8164248 | biostudies-literature
| S-EPMC4053953 | biostudies-literature
| S-EPMC4081054 | biostudies-literature
| S-EPMC10762021 | biostudies-literature
| S-EPMC6126229 | biostudies-literature
| S-EPMC4330915 | biostudies-literature
| S-EPMC7659224 | biostudies-literature
| S-EPMC7604644 | biostudies-literature
| S-EPMC3219132 | biostudies-literature