ABSTRACT: Project Description
Esophagitis is a frequent condition that remains poorly characterized at the molecular level and has diverse underlying etiologies and treatments. Correct diagnosis can be challenging due to partially overlapping histological features. By LC-MS/MS-based proteomic profiling of 55 biopsy specimens representing controls, reflux (GERD), eosinophilic (EoE), Crohn's (CD), and herpes simplex virus (HSV) esophagitis, as well as Candida albicans infection, we identified distinct signatures and functional networks. Our integrated AI-assisted morphoproteomic approach allows deeper insights into disease-specific molecular alterations and represents a promising tool in esophagitis-related precision medicine.
The FFPE samples were further processed including macrodissection, protein extraction, protein precipitation, protein digestion, and peptide clean up according to a previously published and modified protocol (Buczak et al., 2020). For further details, please refer to the Materials and Methods section of the manuscript.
For in-depth proteomic characterization of the samples, a label-free high-resolution LC-MS/MS approach was chosen, using an Orbitrap Tribrid Fusion mass spectrometer operated in DIA mode. Tryptic peptides were loaded onto a µPAC Trapping Column at a flow rate of 10 µl per min in 0.1% trifluoroacetic acid in HPLC-grade water. The trapping column has a pillar diameter of 5 µm, inter-pillar distance of 2.5 µm, pillar length/bed depth of 18 µm, external porosity of 9%, bed channel width of 2 mm and bed length of 10 mm; the pillars are superficially porous with a porous shell thickness of 300 nm and pore sizes in the order of 100 to 200 Å. Peptides were then eluted and separated on a PharmaFluidics 50 cm µPAC C18 nano-LC column (pillar diameter of 5 µm, inter-pillar distance of 2.5 µm, pillar length/bed depth of 18 µm, external porosity of 59%, bed channel width of 315 µm and bed length of 50 cm; pillars superficially porous with a porous shell thickness of 300 nm and pore sizes in the order of 100 to 200 Å). Separation was performed by a linear gradient from 2% to 30% of buffer B (80% acetonitrile and 0.08% formic acid in HPLC-grade water) in buffer A (2% acetonitrile and 0.1% formic acid in HPLC-grade water) at a flow rate of 300 nl per min. The remaining peptides were eluted by a short 10-minute gradient from 30% to 95% buffer B, followed by 25 minutes at 2% buffer B; the total run time was 120 min.
Spectra were acquired in DIA mode using 50 variable-width windows over the mass range 350-1500 m/z. The Orbitrap was used for MS1 and MS2 detection, with the MS1 AGC target set to 20 × 10^4 and the maximum injection time set to 100 ms. The MS2 scan range was set between 200 and 2000 m/z, with a minimum of 6 points across the peak. For MS2, the Orbitrap resolution was set to 30K, the isolation window to 1.6, the AGC target to 50 × 10^4 and the maximum injection time to 54 ms. MS1 and MS2 data were acquired in centroid mode.
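The description does not state how the 50 variable-width windows were derived. One common scheme, shown here purely as an illustrative sketch and not as the method used in this project, places window edges at quantiles of an expected precursor m/z distribution so that each window covers roughly the same number of precursors. The function name `variable_windows` and the simulated precursor list are hypothetical.

```python
import random


def variable_windows(precursor_mz, n_windows=50, mz_min=350.0, mz_max=1500.0):
    """Return contiguous (lower, upper) windows holding roughly equal precursor counts."""
    mzs = sorted(m for m in precursor_mz if mz_min <= m <= mz_max)
    if len(mzs) < n_windows:
        raise ValueError("need at least one precursor per window")
    edges = [mz_min]
    for i in range(1, n_windows):
        # Boundary at the i-th quantile of the sorted precursor list
        edges.append(mzs[i * len(mzs) // n_windows])
    edges.append(mz_max)
    return list(zip(edges[:-1], edges[1:]))


random.seed(0)
# Simulated precursor m/z values (hypothetical): denser at low m/z, sparse at high m/z
simulated = [350.0 + random.betavariate(2, 5) * 1150.0 for _ in range(20000)]

windows = variable_windows(simulated)
widths = [upper - lower for lower, upper in windows]
print(f"{len(windows)} windows, narrowest {min(widths):.1f} m/z, widest {max(widths):.1f} m/z")
```

With such a scheme, windows are narrow where precursors are dense and wide where they are sparse, which is the usual motivation for variable-width DIA windows.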
To check retention time (RT) stability, iRT standards (Biognosys) were spiked into each sample according to the manufacturer's recommendations. The 11 iRT peptide sequences were manually added to the database and used during the DIA-NN search to generate the precursor ion library used for MS data analysis. To reduce the possibility of carry-over and cross-contamination between samples, two BSA washes were run between samples, and a trap column wash followed by two BSA washes was run after every 10 samples.
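As a minimal sketch of how RT stability could be assessed from spiked-in standards, the example below computes the coefficient of variation (CV) of the observed retention time per peptide across runs. The peptide labels, retention times, and the 2% flagging threshold are all hypothetical and are not taken from this dataset.

```python
from statistics import mean, stdev


def rt_cv_percent(rt_by_peptide):
    """Coefficient of variation (%) of retention time per peptide across runs."""
    return {
        peptide: 100.0 * stdev(rts) / mean(rts)
        for peptide, rts in rt_by_peptide.items()
        if len(rts) >= 2  # a CV needs at least two runs
    }


# Hypothetical observed RTs in minutes across three runs; labels are placeholders
observed = {
    "iRT_pep_A": [12.1, 12.3, 12.2],
    "iRT_pep_B": [45.0, 45.4, 44.8],
    "iRT_pep_C": [78.9, 79.5, 79.2],
}

cv = rt_cv_percent(observed)
# The 2% cut-off is an assumption chosen only for illustration
flagged = [peptide for peptide, value in cv.items() if value > 2.0]

for peptide, value in cv.items():
    print(f"{peptide}: CV = {value:.2f}%")
print("Peptides exceeding threshold:", flagged)
```

A low CV across all standards suggests that the chromatography was stable over the acquisition sequence; peptides above the chosen threshold would point to runs worth inspecting.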
The above-mentioned workflow is schematically displayed in Fig. 1B.
LC-MS/MS was performed at the Proteome Center Tuebingen (PCT). Here, 250 ng of peptides were loaded onto an Easy-nLC 1200 system coupled to a quadrupole Orbitrap Exploris 480 mass spectrometer (all Thermo Fisher Scientific, Waltham, MA, USA) as previously described (Krauss et al, 2023).
MS raw data files were analyzed using DIA-NN 1.8.1 (Demichev et al, 2020) in library-free mode against the human database (UniProt release March 2024, 20412 proteins). First, a precursor ion library was generated by in silico FASTA digest for library-free search in combination with deep learning-based spectra prediction. An experimental library generated from the DIA-NN search was used for cross-run normalization and mass accuracy correction. Only high-accuracy spectra passing a precursor FDR threshold of 0.01, and only tryptic peptides (up to 2 missed cleavages), were used for protein quantification. The match between runs option was activated and no shared spectra were used for protein identification. In addition to the human database, Normal, HSV, and Candida samples were searched against reviewed entries of HSV1 (taxonomy id 10298, 125 entries), HSV2 (taxonomy id 10310, 95 entries), and C. albicans (taxonomy id 5476, 1412 entries), downloaded on 04.03.2024.
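The precursor-level FDR filter described above can be illustrated with a short sketch that keeps only rows at or below a q-value of 0.01. The column names (`Run`, `Protein.Group`, `Precursor.Id`, `Q.Value`) follow the DIA-NN main report format as commonly documented and should be checked against the actual output; the rows, protein labels, and the helper name `filter_precursors` are hypothetical.

```python
import csv
import io


def filter_precursors(report_tsv, max_q=0.01):
    """Keep report rows whose precursor q-value is at or below max_q."""
    reader = csv.DictReader(report_tsv, delimiter="\t")
    return [row for row in reader if float(row["Q.Value"]) <= max_q]


# Tiny hypothetical report; a real one would be opened from a file instead
example_report = io.StringIO(
    "Run\tProtein.Group\tPrecursor.Id\tQ.Value\n"
    "sample_01\tPROT_A\tAAAGGK2\t0.002\n"
    "sample_01\tPROT_B\tCCDDEEK3\t0.030\n"
    "sample_02\tPROT_A\tAAAGGK2\t0.010\n"
    "sample_02\tPROT_C\tFFGGHHR2\t0.250\n"
)

kept = filter_precursors(example_report)
for row in kept:
    print(row["Run"], row["Precursor.Id"], row["Q.Value"])
```

Only the precursors that pass the threshold would then feed into protein-level quantification.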