ABSTRACT: Importance: Health care-associated infections (HAIs) are preventable, harmful, and costly; however, few resources are dedicated to infection surveillance of nonsurgical procedures, particularly cardiovascular implantable electronic device (CIED) procedures. Objective: To develop a method that includes text mining of electronic clinical notes to reliably and efficiently measure HAIs for CIED procedures. Design, Setting, and Participants: In this multicenter, national cohort study using electronic medical record data for patients undergoing CIED procedures in Veterans Health Administration (VA) facilities in fiscal years (FYs) 2016 and 2017, an algorithm was developed and validated to flag cases with a true CIED-related infection based on structured data (eg, microbiology orders, vital signs) and free-text diagnostic and therapeutic data (eg, procedure notes, discharge summaries, microbiology results). Procedure data were divided into development and validation data sets. Criterion validity (ie, positive predictive validity [PPV], sensitivity, and specificity) was assessed via criterion-standard manual medical record review. Exposures: CIED procedure. Main Outcomes and Measures: The concordance between medical record review and the study algorithm with respect to the presence or absence of a CIED infection. Variables used by the algorithm to identify CIED infection included 90-day mortality, congestive heart failure and nonmetastatic tumor comorbidities, CIED or surgical site infection International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes, antibiotic treatment of Staphylococci, a microbiology test of a cardiac specimen, and text documentation of infection in specific clinical notes (eg, cardiology, infectious diseases, inpatient discharge summaries).
Results: The algorithm sample consisted of 19 212 CIED procedures; 15 077 patients (78.5%) were White individuals, 1487 (15.5%) were African American individuals, and 18 766 (97.7%) were men. The mean (SD) age in the sample was 71.8 (10.6) years. The infection detection threshold was set at a predicted probability greater than 0.10; in the development data set (9606 procedures), the algorithm flagged 276 cases (2.9%), and PPV in this group was 41.4% (95% CI, 31.6%-51.8%). In the validation data set (9606 procedures), at a predicted probability of 0.10 or more, the algorithm PPV was 43.5% (95% CI, 37.1%-50.2%), and overall sensitivity and specificity were 94.4% (95% CI, 88.2%-97.9%) and 48.8% (95% CI, 42.6%-55.1%), respectively. Conclusions and Relevance: The findings of this study suggest that the method of combining structured and text data in VA electronic medical records can be used to expand infection surveillance beyond traditional boundaries to include outpatient and procedural areas.
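
Illustrative sketch (not the study's implementation): the Python below shows one way the criterion validity measures named above could be computed once each procedure has a predicted probability and a medical record review result. The function names, the `ValidityMetrics` type, and the six toy cases are hypothetical and chosen only for illustration; the only figure taken from the abstract is the flagged fraction of 276 of 9606 procedures. The abstract states the threshold both as "greater than 0.10" and as "0.10 or more", so the use of `>=` here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ValidityMetrics:
    ppv: float
    sensitivity: float
    specificity: float


def flag_cases(predicted_probs, threshold=0.10):
    """Flag a procedure when its predicted probability meets the threshold.

    Assumption: '>=' is used; the abstract does not settle whether the
    boundary value 0.10 itself is flagged.
    """
    return [p >= threshold for p in predicted_probs]


def criterion_validity(flags, reviewed_truth):
    """Compare algorithm flags with manual review (the criterion standard)."""
    pairs = list(zip(flags, reviewed_truth))
    tp = sum(1 for f, t in pairs if f and t)
    fp = sum(1 for f, t in pairs if f and not t)
    fn = sum(1 for f, t in pairs if not f and t)
    tn = sum(1 for f, t in pairs if not f and not t)

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return ValidityMetrics(
        ppv=ratio(tp, tp + fp),
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
    )


# Hypothetical toy data: six procedures, NOT study data.
toy_probs = [0.02, 0.15, 0.40, 0.08, 0.11, 0.30]
toy_truth = [False, True, True, True, False, False]

toy_flags = flag_cases(toy_probs)
toy_metrics = criterion_validity(toy_flags, toy_truth)
print(toy_metrics)

# Flagged fraction reported for the development data set: 276 of 9606.
flagged_pct = round(276 / 9606 * 100, 1)
print(f"Flagged: {flagged_pct}%")
```

In this sketch, sensitivity and specificity are computed only over cases that have a review result, so in practice they would depend on which unflagged cases were sampled for manual review.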