Project description: Transcription profiling of primary breast tumors from 1219 patients in the Carolina Breast Cancer Study (CBCS) using the NanoString nCounter platform, normalized with NanoString nSolver software. The NanoString RNA counting assay for formalin-fixed paraffin-embedded samples is unique in its sensitivity, technical reproducibility, and robustness for analysis of clinical and archival samples. While commercial normalization methods are provided by NanoString, they are not optimal for all settings, particularly when samples exhibit strong technical or biological variation or where housekeeping genes have variable performance across the cohort. Here, we develop and evaluate a more comprehensive normalization procedure for NanoString data with steps for quality control, selection of housekeeping targets, normalization, and iterative data visualization and biological validation. The approach was evaluated using a large cohort from the Carolina Breast Cancer Study. The iterative process developed here eliminates technical variation more reliably than the NanoString commercial package, without diminishing biological variation, especially in long-term longitudinal multi-phase or multi-site cohorts. We also find that probe sets validated for nCounter, such as the PAM50 gene signature, are impervious to batch issues. This work emphasizes that preprocessing of gene expression data is an important component of study design. The normalized data here is processed through the RUVSeq-based iterative framework.
Project description: Transcription profiling of primary breast tumors from 1219 patients in the Carolina Breast Cancer Study (CBCS) using the NanoString nCounter platform, normalized with a RUVSeq-based iterative framework. The NanoString RNA counting assay for formalin-fixed paraffin-embedded samples is unique in its sensitivity, technical reproducibility, and robustness for analysis of clinical and archival samples. While commercial normalization methods are provided by NanoString, they are not optimal for all settings, particularly when samples exhibit strong technical or biological variation or where housekeeping genes have variable performance across the cohort. Here, we develop and evaluate a more comprehensive normalization procedure for NanoString data with steps for quality control, selection of housekeeping targets, normalization, and iterative data visualization and biological validation. The approach was evaluated using a large cohort from the Carolina Breast Cancer Study. The iterative process developed here eliminates technical variation more reliably than the NanoString commercial package, without diminishing biological variation, especially in long-term longitudinal multi-phase or multi-site cohorts. We also find that probe sets validated for nCounter, such as the PAM50 gene signature, are impervious to batch issues. This work emphasizes that preprocessing of gene expression data is an important component of study design. The normalized data here is processed through the RUVSeq-based iterative framework.
Project description: A microarray targeting promoters of cancer-related genes was used to evaluate DNA methylation at 935 CpG sites in 517 invasive breast tumors from the Carolina Breast Cancer Study (CBCS), a population-based study of invasive breast cancer. Consensus clustering using methylation (β) values for the 167 most variant CpG loci defined 4 clusters differing most distinctly in hormone receptor (HR) status, intrinsic subtype (luminal versus basal-like), and p53 mutation status. Supervised analyses for HR status, subtype, and p53 status identified differentially methylated CpG loci with considerable overlap (n=266). Consensus clustering also defined a hypermethylated, luminal-enriched tumor cluster (cluster 3); gene ontology analysis of cluster 3 hypermethylated loci revealed enrichment for developmental genes, including homeobox domain genes (HOXB13, PAX6, IPF1, EYA4, DLK1, IHH, ISL1, TBX1, SOX1, SOX17). The hypermethylated luminal-enriched cluster 3 independently predicted poorer survival in multivariate Cox proportional hazards analysis, and this finding was confirmed in analysis of luminal A tumors. This study demonstrates epigenetic heterogeneity among breast tumors of a single intrinsic subtype, and shows that epigenetic patterns are strongly associated with HR status, subtype, and p53 mutation status. Among HR+ tumors, a gene signature characterized by hypermethylation of developmental genes may have prognostic value. Genes differentially methylated between clinically important tumor subsets have roles in differentiation, development, and tumor growth and may be critical to inducing and maintaining tumor phenotypes and clinical outcomes. 517 breast tumors, 9 normal breast tissues.
Project description: A microarray targeting promoters of cancer-related genes was used to evaluate DNA methylation at 935 CpG sites in 517 invasive breast tumors from the Carolina Breast Cancer Study (CBCS), a population-based study of invasive breast cancer. Consensus clustering using methylation (β) values for the 167 most variant CpG loci defined 4 clusters differing most distinctly in hormone receptor (HR) status, intrinsic subtype (luminal versus basal-like), and p53 mutation status. Supervised analyses for HR status, subtype, and p53 status identified differentially methylated CpG loci with considerable overlap (n=266). Consensus clustering also defined a hypermethylated, luminal-enriched tumor cluster (cluster 3); gene ontology analysis of cluster 3 hypermethylated loci revealed enrichment for developmental genes, including homeobox domain genes (HOXB13, PAX6, IPF1, EYA4, DLK1, IHH, ISL1, TBX1, SOX1, SOX17). The hypermethylated luminal-enriched cluster 3 independently predicted poorer survival in multivariate Cox proportional hazards analysis, and this finding was confirmed in analysis of luminal A tumors. This study demonstrates epigenetic heterogeneity among breast tumors of a single intrinsic subtype, and shows that epigenetic patterns are strongly associated with HR status, subtype, and p53 mutation status. Among HR+ tumors, a gene signature characterized by hypermethylation of developmental genes may have prognostic value. Genes differentially methylated between clinically important tumor subsets have roles in differentiation, development, and tumor growth and may be critical to inducing and maintaining tumor phenotypes and clinical outcomes.