ABSTRACT: The number of patients affected by chronic diseases with special vaccination needs is burgeoning. In this scenario, predictive markers of immunogenicity and signatures of immune responses are typically missing, even though they would especially improve the identification of personalized immunization practices in these populations. We aimed to develop a predictive score of immunogenicity to Influenza Trivalent Inactivated Vaccination (TIV) by applying deep machine learning algorithms to transcriptional data from sort-purified lymphocyte subsets after in vitro stimulation. Peripheral blood mononuclear cells (PBMCs) collected before TIV from 23 vertically HIV-infected, virally controlled children on ART were stimulated in vitro with p09/H1N1 peptides (stim) or left unstimulated (med). Multiplexed qPCR for 96 genes (Fluidigm, BioMark) was performed on fixed numbers of 3 B cell subsets, 3 T cell subsets and total PBMCs. The ability to respond to TIV was assessed by hemagglutination inhibition assay (HAI) and ELISpot, and patients were classified as Responders (R) or Non-Responders (NR). A predictive modeling framework was applied to the data set to define the genes and conditions with the highest predicted probability to inform the final score. Twelve NR and 11 R were analyzed for gene expression differences in all subsets and 3 conditions (med, stim, or Δ (stim-med)). Differentially expressed genes between R and NR were selected and tested with the Adaptive Boosting model to build a prediction score. The best prediction score was obtained from 46 genes across 5 different subsets and conditions. A combined score based on these 5 categories achieved a model accuracy of 95.6%, with only one misclassified patient.

In vitro stimulation, cell sorting and RNA extraction
PBMCs were thawed and counted with a Countess Automated Cell Counter (Life Technologies).
Cells were resuspended in complete RPMI medium at a concentration of 5 x 10^6 PBMCs/mL and left at 37°C for 16 hours in the presence or absence of H1N1 A/California/09 HA peptides at a final concentration of 20 µL/mL. PBMCs were stained for 15 minutes with ViViD (Pacific Blue), CD10 (PECy7), CD20 (PE), CD27 (APC), IgD (FITC), and CD21 (PECy5) for the B cell panel, or with CD3 (AmCyan), CD4 (PerCP Cy5.5), CD45RO (ECD), CCR7 (Alexa Fluor 700), CXCR5 (Alexa Fluor 647), and a live/dead marker (ViViD; Molecular Probes) for the T cell panel. Stained PBMCs were then washed twice in PBS, filtered through a 40 µm mesh and sorted on a FACSAria II (BD Biosciences). The purity of the sorted cell populations was typically >99%. All antibodies were previously titrated. Viable lymphocytes were identified as live/dead amine dye-negative (ViViD-) cells (Invitrogen). For each B and T cell subset, 500 live cells were sorted into tubes previously loaded with 9 µL of PCR buffer (see also Figure 1 for gating strategy). After sorting, cells were immediately centrifuged (3000 rpm for 3 minutes) and kept on ice. Samples were subsequently transferred into PCR tubes and amplified on a C1000 Thermal Cycler (Bio-Rad) with the following scheme: 50°C for 20 min, 95°C for 2 min, then 18 cycles of 95°C for 15 s and 60°C for 4 min. Samples were finally kept at -20°C until further analysis. The PCR buffer premix for cell sorting contained: CellsDirect Reaction Mix, 5 µL; DEPC water, 1.4 µL; SuperScript III + Taq, 1 µL; 0.2X diluted assay (96-primer mix), 2.5 µL; SUPERase-In, 0.1 µL.

Multiplexed RT-PCR
Previously amplified samples were loaded on a Fluidigm 96.96 standard chip following the manufacturer’s instructions. Briefly, the assay pre-mix was prepared as a 1:1 mix of 20X TaqMan Gene Expression Assay (Applied Biosystems) and Assay Loading Reagent (Fluidigm).
The sample pre-mix was prepared with TaqMan Universal PCR Master Mix (2X) (Applied Biosystems), 20X GE Sample Loading Reagent (Fluidigm), and cDNA. The full lists of the two gene probe panels (B subsets and T subsets) are shown in Supplemental Tables 1 and 2. 5 µL each of the assay and sample mixes were loaded into the chip according to the manufacturer’s instructions. Genes were selected on the basis of a previous RNA sequencing analysis of HIV-infected children from a different cohort (data not shown), the literature, online gene banks and biological queries.

The cycle threshold (Ct) values in the exported files were corrected for the number of cells sorted when this was lower than 500. The correction followed the proportion 67.5 / 500 = Y / X, where X is the number of cells sorted and Y is the cell-equivalent cDNA of the sorted cells. The dilution factor (n) was calculated as n = 67.5 / Y, and the base 2 logarithm of n was then subtracted from the Ct value to obtain the corrected cycle threshold (cCt). The expression threshold (Et), which was used for the main analysis, was finally obtained as Et = 40 - cCt.

Once exported and corrected, data were analyzed with the Fluidigm SINGuLAR package (SINGuLAR Analysis Toolset 3.0), loaded in R (R 3.0.2, GUI 1.62). As previously described (De Armas LR, 2017), gene expression differences between groups within the same subset and condition were used to identify Differentially Expressed Genes (DEGs). Alternatively, paired gene expression differences between stimulated (stim) and unstimulated (med) samples (stim-med) within the same subset were used to define Differentially Induced Genes (DIGs).
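
The Ct correction described above can be illustrated with a short Python sketch. It is not the authors' analysis code (the analysis was run in R with the SINGuLAR toolset). The function and constant names are hypothetical, and the sketch assumes that wells with 500 or more sorted cells are left uncorrected, as the text implies.

```python
import math

# Constants taken from the text; the names are illustrative assumptions.
REFERENCE_CELLS = 500    # target number of sorted cells per subset
REFERENCE_CDNA = 67.5    # cell-equivalent cDNA corresponding to 500 cells
MAX_CT = 40              # value from which cCt is subtracted to obtain Et


def cell_equivalent_cdna(cells_sorted):
    """Solve 67.5 / 500 = Y / X for Y, where X is the number of cells sorted."""
    return REFERENCE_CDNA * cells_sorted / REFERENCE_CELLS


def corrected_ct(ct, cells_sorted):
    """Return cCt = Ct - log2(n), with n = 67.5 / Y.

    Assumption: the correction applies only when fewer than 500 cells
    were sorted; otherwise Ct is returned unchanged.
    """
    if cells_sorted <= 0:
        raise ValueError("cells_sorted must be positive")
    if cells_sorted >= REFERENCE_CELLS:
        return ct
    y = cell_equivalent_cdna(cells_sorted)
    n = REFERENCE_CDNA / y          # dilution factor
    return ct - math.log2(n)


def expression_threshold(ct, cells_sorted):
    """Return Et = 40 - cCt."""
    return MAX_CT - corrected_ct(ct, cells_sorted)
```

Under these assumptions, a well with 250 sorted cells and a raw Ct of 25 gives n = 2, cCt = 24 and Et = 16, whereas the same Ct from a full 500-cell sort gives Et = 15.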