ABSTRACT: OBJECTIVE: This study aimed to develop a gene-expression signature for identifying lymph node (LN) metastasis in esophageal squamous cell carcinoma (ESCC) patients. SUMMARY OF BACKGROUND DATA: LN metastasis is recognized as the most important independent risk factor for therapeutic decision-making in ESCC patients. METHODS: A bioinformatic approach was used to analyze RNA sequencing profiles of ESCC patients and to develop a gene-expression signature for identifying LN metastasis. The robustness of this panel was assessed in 2 independent patient cohorts (n = 56 and n = 224). RESULTS: We initially prioritized a 16-gene signature out of a total of 20,531 mRNAs. The model estimated from these 16 genes discriminated LN status with an area under the curve (AUC) of 0.77 [95% confidence interval (95% CI), 0.68-0.87; 5-fold cross-validation]. Subsequently, a reduced and optimized 5-gene panel was trained in a clinical cohort; it effectively distinguished ESCC patients with LN metastasis (cohort-1: AUC, 0.74; 95% CI, 0.58-0.89; cohort-2, T1-T2: AUC, 0.74; 95% CI, 0.63-0.86) and was significantly superior to preoperative computed tomography (AUC, 0.61; 95% CI, 0.50-0.72). Furthermore, a combination signature comprising the 5-gene panel together with lymphatic vessel invasion (LVI) and venous invasion (VI) demonstrated significantly improved diagnostic performance compared with individual clinical variables in both cohorts (cohort-1: AUC, 0.87; 95% CI, 0.78-0.96; cohort-2: AUC, 0.76; 95% CI, 0.65-0.88). CONCLUSION: Our novel 5-gene panel is a robust diagnostic tool for LN metastasis, especially in early T-stage ESCC patients, with promising clinical potential.
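
ILLUSTRATIVE NOTE: The results above are reported as an AUC with a 95% CI. The abstract does not state how the intervals were derived, so the following is only a minimal sketch of one common approach, assuming a percentile bootstrap over patients. The function names (auc, bootstrap_ci) and the toy scores and labels are hypothetical and are not taken from the study.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see note)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that lack one of the two classes.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical risk scores; label 1 = LN metastasis, 0 = no LN metastasis.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

toy_auc = auc(toy_scores, toy_labels)  # 20 of 25 pairs ranked correctly
toy_ci = bootstrap_ci(toy_scores, toy_labels, n_boot=500)
print(f"AUC = {toy_auc:.2f}, 95% CI = {toy_ci[0]:.2f}-{toy_ci[1]:.2f}")
```

With only ten toy cases the interval is very wide, which mirrors why the smaller cohort-1 (n = 56) carries wider intervals than cohort-2 in the results above.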