ABSTRACT: Patient-reported outcomes data are needed to determine the efficacy of cosmetic procedures. The objective was to describe the development and psychometric evaluation of 8 appearance scales and 2 adverse effect checklists for use in minimally invasive cosmetic procedures. We performed a psychometric study to select the most clinically sensitive items for inclusion in item-reduced scales and to examine reliability and validity with patients. Recruitment of the sample for this study took place from June 6, 2010, through July 28, 2014. Data analysis was performed from December 11, 2014, to December 22, 2015. Pretreatment and posttreatment patients 18 years and older who were consulting for any type of facial aesthetic treatment were studied. Patients were from plastic surgery and dermatology outpatient clinics in the United States and Canada (field-test sample) and a clinical trial of a minimally invasive lip treatment in the United Kingdom and France (clinical trial sample). The main outcome measures were the FACE-Q scales that measure appearance of the skin, lips, and facial rhytids (ie, overall, forehead, glabella, lateral periorbital area, lips, and marionette lines), with scores ranging from 0 (lowest) to 100 (highest), and the FACE-Q adverse effects checklists for problems after skin and lip treatment. Of 783 patients recruited, 503 field-test patients (response rate, 90%) and 280 clinical trial participants were studied. The mean (SD) age of the patients was 47.4 (14.0) years in the field-test sample and 47.7 (12.3) years in the clinical trial sample. Most of the patients were female (429 [85.3%] in the field-test sample and 274 [97.9%] in the clinical trial sample). Rasch Measurement Theory analyses led to the refinement of 8 appearance scales with 66 total items. All FACE-Q scale items had ordered thresholds and acceptable item fit. Reliability, measured with the Personal Separation Index (range, 0.88-0.95) and Cronbach α (range, 0.93-0.98), was high.
Lower scores for appearance scales that measured the skin (r = -0.48, P < .001), lips (r = -0.21, P = .001), and lip rhytids (r = -0.32, P < .001) correlated with the reporting of more skin- and lip-related adverse effects. Higher scores for the 8 appearance scales correlated (r range, 0.28-0.70; P < .001) with higher scores on the core 10-item FACE-Q satisfaction with facial appearance scale. In the pretreatment group, older age was significantly correlated with lower scores on 5 of the 6 rhytids scales (exception was forehead rhytids) (range, -0.28 to -0.65; P = .03 to <.001). Pretreatment patients reported significantly lower scores on 7 of the 8 appearance scales compared with posttreatment patients (exception was skin) (P < .001 to .005 on independent sample t tests). The FACE-Q appearance scales and adverse effects checklists can be used in clinical practice, research, and quality improvement to incorporate cosmetic patients' perspective in outcome assessments.