ABSTRACT: Alongside clinical prognostic factors for diffuse large B-cell lymphoma (DLBCL), such as the international prognostic index (IPI) components (ie, age, tumor stage, performance status, serum lactate dehydrogenase concentration, and number of extranodal sites), prognostic gene signatures have recently shown promising efficacy. However, previously developed signatures for DLBCL have several major shortcomings, including a lack of reproducibility in external datasets, a large number of genes per signature, and an inconsistent association with survival time across datasets. Accordingly, we sought to find a reproducible prognostic gene signature with a minimal number of genes. Seven datasets were employed: GSE10856 (420 samples), GSE31312 (470 samples), GSE69051 (157 samples), GSE32918 (172 samples), GSE4475 (123 samples), GSE11318 (203 samples), and GSE34171 (91 samples). The datasets were randomly divided into a training group (1219 samples comprising GSE10856, GSE31312, GSE69051, and GSE32918) and a validation group (417 samples comprising GSE4475, GSE11318, and GSE34171). Using univariate Cox proportional hazards analysis, common genes associated with overall survival time, with a P value less than 0.001 and a false discovery rate less than 5%, were identified in the 1219 patients of the 4 training datasets. These common genes were then entered into a multivariate Cox proportional hazards analysis with the IPI components as additional covariates, and only common genes with a significant level of difference (P?2 or <-2) were selected to construct the prognostic signature. These analyses yielded a 7-gene prognostic signature, which efficiently predicted survival time in the training dataset (Ps?