ABSTRACT: Plasmids encoding full-length open reading frames (ORFs) of heterogeneous nuclear ribonucleoproteins fused to a HaloTag (HT) were transfected into human HEK293T cells in biological duplicates. The expressed proteins were: HNRNPA0 (NM_006805; NP_006796.1), HNRNPA1 (NM_002136; NP_002127.1), HNRNPD (NM_002138; NP_002129.2), HNRNPF (NM_001098206; NP_001091676.1), HNRNPH1 (NM_005520; NP_005511.1), HNRNPK (NM_001318187; NP_001305116.1), HNRNPM (NM_005968; NP_005959.2), HNRNPR (NM_005826; NP_005817.1), HNRNPU (NM_031844; NP_114032.2), HNRNPUL1 (NM_007040; NP_008971.2), RBFOX2 (NM_001082578; NP_001076047.1), and UPF1 (NM_002911; NP_002902.2).
Cells were lysed 24 hours post-transfection, and half of the lysates were left untreated, while the remaining half were subjected to stringent RNase digestion (2 µL RNase A; A797C, 4 mg/mL, for each 12-million-cell pellet lysate).
Protein complexes were covalently captured on an HT affinity resin and interacting proteins were eluted and purified for mass spectrometry as described (Daniels, D.L., et al., (2012) J. Proteome Res., 11(2):564-75).
TCA-precipitated proteins were digested with endoproteinase LysC followed by trypsin. The resulting peptide mixtures were analyzed by Multidimensional Protein Identification Technology (MudPIT) on an LTQ ion trap mass spectrometer as described (Florens L, Washburn MP. (2006) Methods Mol. Biol., 328:159-75).
RAW files were extracted into .ms2 file format using RawDistiller v. 1.0 (Zhang, Y., Wen, Z., Washburn, M. P., and Florens, L. (2011) Anal. Chem. 83, 9344-9351). The MS/MS spectra were searched using SEQUEST v.27 rev.9 (Eng, McCormack, and Yates (1994) J. Amer. Mass Spectrom. 5, 976-989) with a mass tolerance of 3 amu for peptides and +/- 0.5 amu for fragment ions. To account for alkylation by chloroacetamide (CAM), 57.02146 Da was added statically to cysteine residues for all searches. No enzyme specificity was imposed during the SEQUEST searches against a protein database containing 29375 non-redundant Homo sapiens proteins (NCBI 2010-11-22 release), as well as 163 usual contaminants such as human keratins, IgGs and proteolytic enzymes. To estimate false discovery rates (FDR), each protein sequence was randomized (keeping the same amino acid composition and length) and the resulting "shuffled" sequences were added to the database used for the SEQUEST searches, for a total search space of 59076 amino acid sequences.
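The decoy construction described above can be illustrated with a minimal Python sketch. It is not the script used to build the deposited database: the function names, the "SHUFFLED_" prefix, and the fixed random seed are assumptions made for illustration. It shuffles each sequence in place, so length and amino acid composition are preserved, and checks the stated search-space size.

```python
import random

def shuffle_sequence(seq: str, rng: random.Random) -> str:
    """Return a random permutation of seq (same length and composition)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)

def add_decoys(proteins: dict[str, str], seed: int = 0) -> dict[str, str]:
    """Return a target + decoy database.

    The "SHUFFLED_" prefix and the seed are hypothetical choices,
    not taken from the original database.
    """
    rng = random.Random(seed)
    combined = dict(proteins)
    for name, seq in proteins.items():
        combined["SHUFFLED_" + name] = shuffle_sequence(seq, rng)
    return combined

# Search-space arithmetic using the counts given in the text:
# every target (human protein or contaminant) gets one shuffled decoy.
N_HUMAN = 29375
N_CONTAMINANTS = 163
TOTAL_SEARCH_SPACE = 2 * (N_HUMAN + N_CONTAMINANTS)
```

With the counts above, TOTAL_SEARCH_SPACE evaluates to 59076, matching the reported total.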
Peptide/spectrum matches (PSMs) were filtered with DTASelect (Tabb, McDonald, and Yates (2002) J. Proteome Res. 1, 21-26) using conservative criteria. PSMs were only retained if they had a DeltCn of at least 0.08. Minimum XCorr values were set at 1.8 for singly-, 2.0 for doubly-, and 3.0 for triply-charged spectra. In addition, peptides had to be at least 7 amino acids long and fully tryptic.
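The filtering criteria above can be expressed as a short Python sketch. This is not DTASelect itself: the PSM record, the function names, and the "K.PEPTIDER.G" flanking-residue notation are assumptions for illustration. Two simplifications are also assumptions: charge states other than 1 to 3 are rejected because no threshold is stated for them, and "fully tryptic" is taken to mean a K/R (or protein terminus, written "-") on both sides, with no special handling of proline.

```python
from dataclasses import dataclass

# Thresholds taken from the text.
MIN_XCORR = {1: 1.8, 2: 2.0, 3: 3.0}  # minimum XCorr per charge state
MIN_DELTCN = 0.08
MIN_LENGTH = 7

@dataclass
class PSM:
    peptide: str   # hypothetical format: previous residue, '.', sequence, '.', next residue
    charge: int
    xcorr: float
    deltcn: float

def is_fully_tryptic(prev: str, core: str, nxt: str) -> bool:
    """Both ends must follow a K/R cleavage or sit at a protein terminus ('-')."""
    n_term_ok = prev in ("K", "R", "-")
    c_term_ok = core[-1] in ("K", "R") or nxt == "-"
    return n_term_ok and c_term_ok

def passes_filter(psm: PSM) -> bool:
    """Apply the DeltCn, XCorr, length, and tryptic criteria to one PSM."""
    threshold = MIN_XCORR.get(psm.charge)
    if threshold is None:          # assumption: unlisted charge states are rejected
        return False
    prev, core, nxt = psm.peptide[0], psm.peptide[2:-2], psm.peptide[-1]
    return (
        psm.deltcn >= MIN_DELTCN
        and psm.xcorr >= threshold
        and len(core) >= MIN_LENGTH
        and is_fully_tryptic(prev, core, nxt)
    )
```

For example, a doubly charged PSM for "K.AAAAAAR.G" with XCorr 2.5 and DeltCn 0.1 passes, whereas the same match assigned a charge of 3 fails the 3.0 XCorr minimum.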
Compressed and archived directories contain the complete mass spectrometry dataset for each analyzed sample: mass spectrometry files (.raw), peak files (.ms2), SEQUEST search files (sequest.params and .sqt), as well as DTASelect result files (DTASelect.params and DTASelect.txt/html). Use the "tar -xzf" command in Linux to restore the folder for each sample and the files therein.
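Where the tar command is unavailable, the same restoration can be sketched with Python's standard tarfile module. The function name and the paths in the usage note are hypothetical; actual archive names depend on the deposited files. As with "tar -xzf", this is only appropriate for archives from a trusted source.

```python
import tarfile
from pathlib import Path

def restore_sample(archive: Path, dest: Path) -> list[str]:
    """Extract a gzip-compressed tar archive into dest and return its member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names
```

A call such as restore_sample(Path("sample.tar.gz"), Path("restored")) would recreate the sample folder under "restored"; both names here are placeholders.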