Project description: Copy number variants (CNVs) are currently defined as genomic sequences that are polymorphic in copy number and range in length from 1,000 to several million base pairs. Among current array-based CNV detection platforms, long-oligonucleotide arrays promise the highest resolution. However, the performance of currently available analytical tools suffers when applied to these data because of the lower signal-to-noise ratio inherent in oligonucleotide-based hybridization assays. We have developed wuHMM, an algorithm for mapping CNVs from array comparative genomic hybridization (aCGH) platforms comprising 385,000 to more than 3 million probes. wuHMM is unique in that it can use sequence divergence information to reduce the false positive rate (FPR). We apply wuHMM to 385K-aCGH, 2.1M-aCGH, and 3.1M-aCGH experiments comparing the 129X1/SvJ and C57BL/6J inbred mouse genomes. We assess wuHMM’s performance on the 385K platform by comparison to the higher-resolution platforms, and we independently validate 10 CNVs. The method requires no training data and is robust with respect to changes in algorithm parameters. At an FPR of less than 10%, the algorithm can detect CNVs with five probes on the 385K platform and three on the 2.1M and 3.1M platforms, resulting in effective resolutions of 24 kb, 2-5 kb, and 1 kb, respectively. Keywords: CNV detection algorithm development and assessment. All four samples in this series are hybridizations of genomic DNA from inbred mouse strains 129X1/SvJ versus C57BL/6J. The experiments were performed at increasing resolutions (one 385K, two 2.1M, and one 3.1M).
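To illustrate why the minimum number of probes per call determines the effective resolution, the sketch below uses a simple threshold-and-run caller. This is not wuHMM (which the description above says is a separate algorithm using sequence divergence information); it is only a minimal stand-in, and the function name `call_runs`, the `threshold` value, and the probe data are hypothetical assumptions. A call is reported only when at least `min_probes` consecutive probes deviate in the same direction, so the smallest reportable CNV spans roughly `min_probes` probe spacings.

```python
from typing import List, Sequence, Tuple

# (start position, end position, number of probes, "gain" or "loss")
Run = Tuple[int, int, int, str]


def call_runs(
    positions: Sequence[int],
    log2_ratios: Sequence[float],
    threshold: float = 0.3,   # hypothetical cutoff; must be > 0
    min_probes: int = 5,      # minimum consecutive probes required for a call
) -> List[Run]:
    """Report runs of consecutive probes beyond +/-threshold in one direction.

    A toy threshold caller, not wuHMM: it only shows how a minimum probe
    count limits the smallest segment that can be reported.
    """
    calls: List[Run] = []
    run_start = 0
    run_sign = 0
    # A trailing 0.0 acts as a sentinel so the final run is closed.
    for i, ratio in enumerate(list(log2_ratios) + [0.0]):
        sign = 1 if ratio >= threshold else -1 if ratio <= -threshold else 0
        if sign != run_sign:
            n = i - run_start
            if run_sign != 0 and n >= min_probes:
                direction = "gain" if run_sign > 0 else "loss"
                calls.append((positions[run_start], positions[i - 1], n, direction))
            run_start = i
            run_sign = sign
    return calls


if __name__ == "__main__":
    # Hypothetical probes spaced 5 kb apart.
    pos = [i * 5000 for i in range(10)]
    ratios = [0.0, 0.0, 0.5, 0.6, 0.4, 0.5, 0.7, 0.0, -0.5, -0.6]
    print(call_runs(pos, ratios, min_probes=5))
    print(call_runs(pos, ratios, min_probes=2))
```

With the stricter `min_probes=5` setting only the five-probe gain is reported; lowering the requirement also admits the two-probe loss, which is the kind of short call that tends to raise the false positive rate on noisy arrays.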
Project description: Clinical laboratories are adopting array comparative genomic hybridization (AGH) as a standard clinical test. Several whole-genome AGH systems are available, but little is known about their comparative performance in a clinical context. We prospectively studied 30 children with idiopathic MR and both unaffected parents of each child using Affymetrix 500K GeneChip SNP arrays, Agilent Human Genome 244K oligonucleotide arrays, and NimbleGen 385K Whole-Genome oligonucleotide arrays. We determined whether CNVs called on these platforms were detected by Illumina Hap550 beadchips or SMRT 32K BAC whole-genome tiling arrays, and we tested 15 of the 30 trios on the Affymetrix 6.0 SNP array. The Affymetrix 500K, Agilent, and NimbleGen platforms identified 3061 autosomal and 117 X chromosome CNVs in the 30 trios. Of these CNVs, 147 were de novo, but only 33 (22%) of the de novo CNVs were found on more than one platform. Performing genotype-phenotype correlations, we identified 7 pathogenic and 4 possibly pathogenic CNVs for MR. All 11 of these CNVs were detected by both the Agilent and NimbleGen arrays, 9 by the Affymetrix 500K and Illumina beadchips, and 5 by the SMRT BAC array. Of the 4 pathogenic or possibly pathogenic CNVs present in the trios tested with the Affymetrix 6.0 array, two were identified by that array. Our findings demonstrate that different results are obtained with different AGH platforms and illustrate the trade-off that exists between sensitivity and specificity. The large number of apparently false positive CNV calls supports the need for validating clinically important findings with a different methodology.
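The cross-platform comparison described above amounts to asking, for each CNV call, which other platforms report an overlapping call. The following is a minimal sketch of that check, assuming a simple "any overlap on the same chromosome" rule; the study's actual matching criteria are not stated here, and the names `overlaps`, `supporting_platforms`, and `percent`, along with the platform labels and coordinates, are hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with inclusive coordinates
Call = Tuple[str, int, int]


def overlaps(a: Call, b: Call) -> bool:
    """True if two calls share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def supporting_platforms(
    query: Call, calls_by_platform: Dict[str, List[Call]]
) -> List[str]:
    """Names of platforms with at least one call overlapping the query."""
    return sorted(
        name
        for name, calls in calls_by_platform.items()
        if any(overlaps(query, call) for call in calls)
    )


def percent(part: int, whole: int) -> int:
    """Rounded percentage, as used for summary counts."""
    return round(100 * part / whole)


if __name__ == "__main__":
    # Hypothetical calls; real platform output would be parsed from files.
    calls = {
        "platform_A": [("chr1", 150, 300)],
        "platform_B": [("chr2", 150, 300)],
        "platform_C": [("chr1", 201, 400)],
    }
    print(supporting_platforms(("chr1", 100, 200), calls))
    # Counts taken from the description: 33 of 147 de novo CNVs.
    print(percent(33, 147))
```

A call supported by only one platform under such a rule is a candidate false positive, which is why the description recommends validating clinically important findings with a different methodology. A stricter rule, such as requiring a minimum reciprocal overlap, would lower the concordance further.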