Dataset Information

Accurate and exact CNV identification from targeted high-throughput sequence data.


ABSTRACT:

Background

Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data.

Results

Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate.
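The first step described above, comparing normalized coverage between samples, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-sample total-depth normalization, the across-sample median used as the reference, and the 0.75 / 1.25 ratio thresholds are all assumptions chosen for illustration. The breakpoint-confirmation step is not covered here.

```python
from statistics import median


def normalize(depths):
    """Scale one sample's per-target depths by that sample's total depth."""
    total = sum(depths)
    if total == 0:
        raise ValueError("sample has no coverage")
    return [d / total for d in depths]


def coverage_ratios(samples):
    """For each sample, divide each target's normalized depth by the
    median normalized depth of that target across all samples."""
    norm = {name: normalize(depths) for name, depths in samples.items()}
    n_targets = len(next(iter(norm.values())))
    medians = [median(v[i] for v in norm.values()) for i in range(n_targets)]
    return {
        name: [
            v[i] / medians[i] if medians[i] > 0 else float("nan")
            for i in range(n_targets)
        ]
        for name, v in norm.items()
    }


def flag_candidates(ratios, loss=0.75, gain=1.25):
    """Return (sample, target_index, kind) for ratios outside the
    hypothetical loss/gain thresholds."""
    calls = []
    for name, row in ratios.items():
        for i, r in enumerate(row):
            if r <= loss:
                calls.append((name, i, "loss"))
            elif r >= gain:
                calls.append((name, i, "gain"))
    return calls


# Toy data: mean depth per targeted region for four barcoded samples.
# Sample "D" has half the expected depth at target 1; sample "B" was
# sequenced twice as deeply overall, which normalization should absorb.
samples = {
    "A": [100, 100, 100, 100],
    "B": [200, 200, 200, 200],
    "C": [100, 100, 100, 100],
    "D": [100, 50, 100, 100],
}
candidates = flag_candidates(coverage_ratios(samples))
print(candidates)
```

In this toy input only target 1 of sample "D" falls below the loss threshold. A real analysis would need many more samples per target and some handling of regions with systematically poor capture; those details are left out of this sketch.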

Conclusions

Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.

SUBMITTER: Nord AS 

PROVIDER: S-EPMC3088570 | biostudies-literature | 2011 Apr

REPOSITORIES: biostudies-literature

Publications

Accurate and exact CNV identification from targeted high-throughput sequence data.

Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom

BMC Genomics, 2011-04-12


Similar Datasets

| S-EPMC4542776 | biostudies-literature
| S-EPMC3295828 | biostudies-literature
| S-EPMC3389763 | biostudies-literature
| S-EPMC2896179 | biostudies-literature
| S-EPMC2859315 | biostudies-literature
| S-EPMC8008629 | biostudies-literature
| S-EPMC3850689 | biostudies-literature
| S-EPMC6084352 | biostudies-literature
| S-EPMC3953903 | biostudies-literature
| S-EPMC3592458 | biostudies-other