Project description:Illumina Infinium whole genome genotyping (WGG) arrays are increasingly being applied in cancer genomics to study gene copy number alterations and allele-specific aberrations such as loss-of-heterozygosity (LOH). Methods developed for normalization of WGG arrays have mostly focused on diploid, normal samples. However, for cancer samples genomic aberrations may confound normalization and data interpretation. Therefore, we examined the effects of the conventionally used normalization method for Illumina Infinium arrays when applied to cancer samples. We demonstrate an asymmetry in the detection of the two alleles for each SNP, which deleteriously influences both allelic proportions and copy number estimates. The asymmetry is caused by a remaining bias between the two dyes used in the Infinium II assay after using the normalization method in Illumina’s proprietary software (BeadStudio). We propose a quantile normalization strategy for correction of this dye bias. We tested the normalization strategy using 535 individual hybridizations from 10 data sets from the analysis of cancer genomes and normal blood samples generated on Illumina Infinium II 300k version 1 and 2, 370k and 550k BeadChips. We show that the proposed normalization strategy successfully removes asymmetry in estimates of both allelic proportions and copy numbers. Additionally, the normalization strategy reduces the technical variation for copy number estimates while retaining the response to copy number alterations. The proposed normalization strategy represents a valuable low-level analysis tool that improves the quality of data obtained from Illumina Infinium arrays, in particular when used for LOH and copy number variation studies.
To investigate the effects of a quantile normalization of Illumina Infinium data, compared to conventional normalization using BeadStudio (www.illumina.com), we renormalized 535 individual hybridizations conducted on Illumina 300K, 370K and 550K BeadChips. Sample types included breast cancer, colon cancer, urothelial carcinoma, leukemia as well as normal blood and HapMap samples. This series includes the 6 breast cancers hybridized on Illumina HumanHap 550K BeadChips.
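As an illustration only, quantile normalization between two dye channels can be sketched as follows. This is a generic rank-based version, not the exact strategy proposed in the study; the function name `quantile_normalize` and the example intensities are hypothetical, and tied values are resolved by position rather than averaged.

```python
def quantile_normalize(channels):
    """Force every channel onto a shared intensity distribution.

    channels: list of equal-length lists, one list of intensities per dye
    channel (hypothetical layout, chosen for this sketch).
    """
    n = len(channels[0])
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same number of probes")

    # For each channel, the probe indices ordered from lowest to highest intensity.
    orders = [sorted(range(n), key=ch.__getitem__) for ch in channels]

    # Target distribution: at each rank, the mean intensity across channels.
    target = [
        sum(ch[order[rank]] for ch, order in zip(channels, orders)) / len(channels)
        for rank in range(n)
    ]

    # Give each probe the target value that corresponds to its rank.
    normalized = []
    for order in orders:
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = target[rank]
        normalized.append(out)
    return normalized


# Made-up intensities for three probes in two dye channels.
red = [5.0, 2.0, 3.0]
green = [4.0, 1.0, 6.0]
red_norm, green_norm = quantile_normalize([red, green])
```

After normalization both channels contain the same set of values, so any systematic intensity difference between the dyes is removed while the rank order of probes within each channel is preserved.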
Project description:Copy number profiling of 36 ovarian tumors on Affymetrix 100K SNP arrays. Thirty-six ovarian tumors were profiled for copy-number alterations with the Affymetrix 100K Mapping Array. Copy number profiling of 36 ovarian tumors on Affymetrix 500K SNP arrays. Sixteen ovarian tumors were profiled for copy-number alterations with the high-resolution Affymetrix 500K Mapping Array. Affymetrix 100K Mapping Array intensity signal CEL files were processed by dChip 2005 (build date Nov 30, 2005) using the PM/MM difference model and invariant set normalization. Each probe set was mapped to the genome, NCBI assembly version 36, using annotation provided by the Affymetrix web site. The log2 ratios were centered to a median of zero and segmented using the GLAD package for the R statistical environment. Copy number was calculated as power(2, log2ratio + 1). Affymetrix 500K Mapping Array intensity signal CEL files were processed by dChip 2005 (build date Nov 30, 2005) using the PM/MM difference model and invariant set normalization. Forty-eight normal samples were downloaded from the Affymetrix website (http://www.affymetrix.com/support/technical/byproduct.affx?product=500k) and analyzed at the same time. For each set (Sty and Nsp), the CEL file with the median signal intensity across the set was selected as the reference array. The dChip-normalized signal intensities were converted to log2 ratios and segmented as follows. For each autosomal probe set, the log2 tumor/normal ratio of each tumor sample was calculated using the average intensity for that probe set in the normal set. For chromosome X, the average of the 20 normal female samples was used. Each probe set was mapped to the genome, NCBI assembly version 36, using annotation provided by the Affymetrix web site. The log2 ratios were centered to a median of zero and segmented using the GLAD package for the R statistical environment. Copy number was calculated as power(2, log2ratio + 1).
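The arithmetic in the record above (log2 tumor/normal ratio, median centering, and copy number as power(2, log2ratio + 1)) can be sketched as follows. This is a minimal illustration, not the dChip or GLAD implementation; the function names and example intensities are hypothetical, and segmentation is not shown.

```python
import math
from statistics import mean, median


def log2_tumor_normal_ratio(tumor_intensity, normal_intensities):
    """log2 of tumor intensity over the average intensity in the normal set."""
    return math.log2(tumor_intensity / mean(normal_intensities))


def center_to_median_zero(log2_ratios):
    """Shift the log2 ratios of one sample so that their median is zero."""
    m = median(log2_ratios)
    return [r - m for r in log2_ratios]


def copy_number(log2_ratio):
    """power(2, log2ratio + 1): a log2 ratio of 0 maps to two copies."""
    return 2.0 ** (log2_ratio + 1.0)


# Made-up probe-set intensities for one tumor and a two-sample normal set.
tumor = [400.0, 800.0, 200.0]
normals = [[400.0, 400.0], [400.0, 400.0], [400.0, 400.0]]

ratios = [log2_tumor_normal_ratio(t, n) for t, n in zip(tumor, normals)]
centered = center_to_median_zero(ratios)
copies = [copy_number(r) for r in centered]
```

The "+ 1" in the exponent encodes the diploid baseline: a centered log2 ratio of 0 gives 2 copies, +1 gives 4, and -1 gives 1.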
Project description:Development of clinically relevant animal models of RCC for preclinical investigations. For DNA copy number analysis, the Sty I (250K) SNP array of the 500K Human Mapping Array (Affymetrix) was used. Arrays were scanned with a GeneChip Scanner 3000 7G. Probe-level signal intensities were normalized to a baseline array with median intensity using invariant set normalization, and SNP-level signal intensities were obtained using a model-based (PM/MM) method. Keywords: SNP array data, renal cell carcinoma
Project description:SNP arrays were used to derive copy number estimates and identify amplifications and deletions in melanomas. These copy number breakpoints were compared to gene fusions identified by second-generation sequencing of cDNA.
Project description:We performed Illumina Infinium whole-genome SNP-CN profiling of the KMS11, MM.1S, and RPMI8226 multiple myeloma cell lines to detect gene copy number variants distinct to each cell line.
Project description:An eQTL analysis showed that mutations in the PDR8 gene in strain 59A relative to S288c could trigger variation in the expression of QDR2. To confirm this result and highlight other gene expression variations associated with PDR8 allelic variation, we performed an allele switch of PDR8 in the 59A background (59A PDR8-S288c) and compared the transcriptomic profile of this strain to that of 59A. The analysis was performed under wine alcoholic fermentation conditions, in stationary phase during nitrogen starvation and under alcoholic stress. PDR8 allelic variation: 9 non-synonymous SNPs (S288c->59A): 9(K->T), 17(S->L), 79(P->S), 198(G->D), 263(H->R), 267(T->S), 371(L->F), 550(A->G), 601(I->V). Six transcriptomic profiles were generated with Agilent mono-color arrays: 3 hybridizations were performed for each strain, corresponding to 3 biological replicates. With the mono-color arrays, normalized intensities were used to calculate log2 expression ratios.
Project description:We describe a method for automatic detection of absolute segmental copy numbers and genotype status in complex cancer genome profiles measured by SNP arrays. The method is based on pattern recognition of segmented and smoothed copy number and allelic imbalance profiles. Overall copy number assignments were verified by DNA indexes of breast carcinomas and karyotypes of cell lines. The method performs well even for poor quality data, low tumor content, and highly rearranged tumor genomes.