ABSTRACT: Co-expression modules are groups of genes with highly correlated expression patterns. In cancer, differences in module activity potentially represent the heterogeneity of phenotypes important in carcinogenesis, progression, or treatment response. To find gene expression modules active in breast cancer subpopulations, we assembled 72 breast cancer-related gene expression datasets containing ~5,700 samples altogether. Per dataset, we identified genes with bimodal expression and used mixture-model clustering to ultimately define 11 modules of genes that are consistently co-regulated across multiple datasets. Functionally, these modules reflected estrogen signaling, development/differentiation, immune signaling, histone modification, ERBB2 signaling, the extracellular matrix (ECM) and stroma, and cell proliferation. The Tcell/Bcell immune modules appeared tumor-extrinsic, with coherent expression in tumors but not cell lines, whereas most other modules, interferon and ECM included, appeared intrinsic. Only four of the 11 modules were represented in the PAM50 intrinsic subtype classifier and other well-established prognostic signatures, although the immune modules were highly correlated with previously published immune signatures. As expected, the proliferation module was strongly associated with decreased recurrence-free survival (RFS). Interestingly, the immune modules appeared associated with RFS even after adjustment for receptor subtype and proliferation, and in a multivariate analysis, the combination of Tcell/Bcell immune module downregulation and proliferation module upregulation was strongly associated with decreased RFS. Immune modules are unusual in that their upregulation is associated with both a good prognosis without chemotherapy and a good response to chemotherapy, suggesting the paradox of high-immune patients who respond to chemotherapy but would do well without it.
Other findings concern the ECM/stromal modules, which, despite common themes, were associated with different sites of metastasis, possibly relating to the "seed and soil" hypothesis of cancer dissemination. Overall, co-expression modules provide a high-level functional view of breast cancer that complements the "cancer hallmarks" and may form the basis for improved predictors and treatments.
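
As an illustration of the per-dataset screen for genes with bimodal expression, the following is a minimal sketch and not the authors' actual procedure. It assumes one plausible criterion: a gene is called bimodal when a two-component Gaussian mixture, fitted by expectation-maximization, has a lower Bayesian information criterion (BIC) than a single Gaussian. The function names (`fit_two_component_mixture`, `is_bimodal`), the initialization, and the BIC rule are all hypothetical choices made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(values, weights, means, variances):
    """Log-likelihood of the data under a Gaussian mixture (one or more components)."""
    total = 0.0
    for x in values:
        density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(density, 1e-300))  # floor guards against log(0)
    return total


def fit_two_component_mixture(values, n_iter=200, min_var=1e-6):
    """Fit a two-component 1D Gaussian mixture by EM (needs at least two values).

    Assumed initialization: the means start at the averages of the lower and
    upper halves of the sorted data; both variances start at the overall variance.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    weights = [0.5, 0.5]
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    variances = [grand_var, grand_var]
    for _ in range(n_iter):
        # E-step: posterior probability that each sample belongs to each component
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s] if s > 0 else [0.5, 0.5])
        # M-step: re-estimate weights, means, and variances from the responsibilities
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(spread, min_var)
    return weights, means, variances


def bic(loglik, n_params, n_samples):
    """Bayesian information criterion; lower values indicate a better model."""
    return n_params * math.log(n_samples) - 2.0 * loglik


def is_bimodal(values):
    """Assumed rule: bimodal if the two-component fit beats one Gaussian on BIC."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((x - mean) ** 2 for x in values) / n, 1e-6)
    bic_one = bic(log_likelihood(values, [1.0], [mean], [var]), 2, n)
    w, m, v = fit_two_component_mixture(values)
    bic_two = bic(log_likelihood(values, w, m, v), 5, n)  # 2 means, 2 variances, 1 weight
    return bic_two < bic_one


# Hypothetical expression values for one gene across samples (two clear groups).
example_gene = [0.1 * i for i in range(-5, 6)] + [5.0 + 0.1 * i for i in range(-5, 6)]
print(is_bimodal(example_gene))
```

In practice such a screen would be run per gene within each dataset, and only the genes passing it would go on to the clustering step; the BIC comparison is used here because it penalizes the extra parameters of the two-component model.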