Dataset Information

cnvOffSeq: detecting intergenic copy number variation using off-target exome sequencing data.


ABSTRACT:

MOTIVATION: Exome sequencing technologies have transformed the field of Mendelian genetics and allowed for efficient detection of genomic variants in protein-coding regions. The target enrichment process that is intrinsic to exome sequencing is inherently imperfect, generating large amounts of unintended off-target sequence. Off-target data are characterized by very low and highly heterogeneous coverage and are usually discarded by exome analysis pipelines. We posit that off-target read depth is a rich, but overlooked, source of information that could be mined to detect intergenic copy number variation (CNV). We propose cnvOffSeq, a novel normalization framework for off-target read depth that is based on local adaptive singular value decomposition (SVD). This method is designed to address the heterogeneity of the underlying data and allows for accurate and precise CNV detection and genotyping in off-target regions.

RESULTS: cnvOffSeq was benchmarked on whole-exome sequencing samples from the 1000 Genomes Project. In a set of 104 gold standard intergenic deletions, our method achieved a sensitivity of 57.5% and a specificity of 99.2%, while maintaining a low FDR of 5%. For gold standard deletions longer than 5 kb, cnvOffSeq achieves a sensitivity of 90.4% without increasing the FDR. cnvOffSeq outperforms both whole-genome and whole-exome CNV detection methods considerably and is shown to offer a substantial improvement over naïve local SVD.

AVAILABILITY AND IMPLEMENTATION: cnvOffSeq is available at http://sourceforge.net/p/cnvoffseq/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
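
As a rough illustration of what SVD-based read-depth normalization means, the sketch below removes the dominant singular components from a samples-by-windows depth matrix. It is a plain global SVD under stated assumptions, not the local adaptive method the abstract describes and not cnvOffSeq's own code. The function name `svd_normalize`, the windowed matrix layout, the `n_components` and `pseudocount` parameters, and the simulated data are all hypothetical.

```python
import numpy as np

def svd_normalize(depth, n_components=1, pseudocount=1.0):
    """Subtract the top singular components from a samples x windows depth matrix.

    Assumption: the leading components mostly capture technical variation
    shared across samples, so what remains is closer to copy-number signal.
    """
    depth = np.asarray(depth, dtype=float)
    # Log-transform so multiplicative coverage biases become additive.
    log_depth = np.log2(depth + pseudocount)
    # Center each window (column) on its median across samples.
    centered = log_depth - np.median(log_depth, axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, s.size)
    # Rebuild the dominant variation from the top k components and remove it.
    systematic = (u[:, :k] * s[:k]) @ vt[:k, :]
    return centered - systematic

# Simulated example: 10 samples, 40 off-target windows, low coverage.
rng = np.random.default_rng(0)
sim_depth = rng.poisson(5, size=(10, 40)).astype(float)
sim_depth[3, 15:25] *= 0.5  # hypothetical heterozygous deletion in sample 3
residual = svd_normalize(sim_depth, n_components=2)
print(residual.shape)
```

In a sketch like this, windows where one sample's residual stays consistently negative would be deletion candidates. Choosing how many components to remove, and doing so per region rather than genome-wide, is the part the abstract attributes to the local adaptive approach.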

SUBMITTER: Bellos E 

PROVIDER: S-EPMC4147927 | biostudies-literature | 2014 Sep

REPOSITORIES: biostudies-literature

Publications

cnvOffSeq: detecting intergenic copy number variation using off-target exome sequencing data.

Bellos Evangelos, Coin Lachlan J M

Bioinformatics (Oxford, England), 2014-09-01, issue 17


Similar Datasets

| S-EPMC3549847 | biostudies-literature
| S-EPMC4053953 | biostudies-literature
| S-EPMC4081054 | biostudies-literature
| S-EPMC6836508 | biostudies-literature
| S-EPMC4155258 | biostudies-literature
| S-EPMC5452530 | biostudies-literature
| S-EPMC8406611 | biostudies-literature
| S-EPMC5175347 | biostudies-literature
| S-EPMC4630827 | biostudies-literature
| S-EPMC6829143 | biostudies-literature