Project description:The role of stem cells in solid tumors remains controversial. In colorectal cancers (CRC), this is complicated by the conflicting ‘top-down’ and ‘bottom-up’ hypotheses of cancer initiation. We profiled gene expression in the top (T) and bottom (B) fractions of the crypt in morphologically normal-appearing colonic mucosa (M) and contrasted it with that of matched mucosa adjacent to tumors (MT) in twenty-three sporadic CRC patients. In thirteen patients, the genetic distance (M-MT) between the B fractions is smaller than the distance between the T fractions, indicating that the expression of significant genes diverges further in the top fractions (B<T). In the remaining ten patients, the reverse is observed (B>T). Taking genetic divergence as an intermediate endpoint, the data indicate that CRC is equally likely to initiate ‘top-down’ via dedifferentiated colonocytes or ‘bottom-up’ via dysregulated intestinal stem cells. This has important ramifications for subsequent therapeutic considerations.
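The per-patient comparison described above (M-MT distance in the bottom fraction versus the top fraction) can be sketched as follows. This is a minimal illustration only: the description does not state which distance metric was used, so Euclidean distance over expression vectors is an assumption, and the function names and example values are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two expression vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("expression vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_patient(m_bottom, mt_bottom, m_top, mt_top):
    """Label a patient by comparing M-MT divergence in each crypt fraction.

    Returns "B<T" when the bottom fractions diverge less than the top
    fractions, "B>T" for the reverse, and "B=T" for an exact tie.
    """
    d_bottom = euclidean_distance(m_bottom, mt_bottom)
    d_top = euclidean_distance(m_top, mt_top)
    if d_bottom < d_top:
        return "B<T"
    if d_bottom > d_top:
        return "B>T"
    return "B=T"


# Hypothetical expression values for two genes, for illustration only.
label = classify_patient(
    m_bottom=[1.0, 2.0], mt_bottom=[1.0, 2.1],
    m_top=[1.0, 2.0], mt_top=[3.0, 5.0],
)
```

Counting the labels across all patients would then give the B<T and B>T group sizes reported in the description.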
Project description:Raw data and TDReport Files for "Top-Down Proteomics Enables Comparative Analysis of Brain Proteoforms Between Mouse Strains". PTMs were shotgun annotated using UniProt.
Project description:Transcriptional profiling of human colorectal cancer tissues compared with control samples of histologically normal tissue adjacent to the tumors. High-throughput bioluminescence imaging and PET were performed. The overall goal was to determine the differential expression of lncRNA and mRNA between tumor and matched normal samples.
Project description:Histones were isolated from brown adipose tissue and liver from mice housed at 28, 22, or 8 °C. Quantitative top- or middle-down approaches were used to quantitate histone H4 and H3.2 proteoforms. See published article for complementary RNA-seq and RRBS datasets.
Project description:Colorectal cancer is the second leading cause of cancer death worldwide, and the incidence of this disease is expected to increase as global socioeconomic changes occur. Immune checkpoint inhibition therapy is effective in treating a minority of colorectal cancer tumors; however, microsatellite stable tumors do not respond well to this treatment. Emerging cancer immunotherapeutic strategies aim to activate a cytotoxic T cell response against tumor-specific antigens, presented exclusively at the cell surface of cancer cells. These antigens are rare and are most effectively identified with a mass spectrometry-based approach, which allows the direct sampling and sequencing of these peptides. While the few tumor-specific antigens identified to date are derived from coding regions of the genome, recent findings indicate that a large proportion of tumor-specific antigens originate from allegedly noncoding regions. Here, we employed a novel proteogenomic approach to identify tumor antigens in a collection of colorectal cancer-derived cell lines and biopsy samples consisting of matched tumor and normal adjacent tissue. The generation of personalized cancer databases paired with mass spectrometry analyses permitted the identification of more than 30 000 unique MHC I-associated peptides. We identified 19 putative tumor-specific antigens in both microsatellite stable and unstable tumors, over two-thirds of which were derived from noncoding regions. Many of these peptides were derived from source genes known to be involved in colorectal cancer progression, suggesting that antigens from these genes could have therapeutic potential in a wide range of tumors. These findings could benefit the development of T cell-based vaccines, in which T cells are primed against these antigens to target and eradicate tumors. Such a vaccine could be used in tandem with existing immune checkpoint inhibition therapies to bridge the gap in treatment efficacy across subtypes of colorectal cancer with varying prognoses.
Project description:Large-scale top-down proteomics characterizes proteoforms in cells globally with high confidence and high throughput using reversed-phase liquid chromatography (RPLC)-tandem mass spectrometry (MS/MS) or capillary zone electrophoresis (CZE)-MS/MS. The false discovery rate (FDR) from the target-decoy database search is typically used to filter identified proteoforms to ensure high-confidence identifications (IDs). It has been demonstrated that the FDRs in top-down proteomics can be drastically underestimated. An alternative approach to the FDR can be useful for further evaluating the confidence of proteoform IDs after database search. We argue that accurately predicting the retention/migration time of proteoforms in RPLC/CZE separations and comparing the predicted and experimental separation times could be a useful and practical approach. To our knowledge, there is no report in the literature on predicting the separation time of proteoforms using large top-down proteomics datasets. In this pilot study, for the first time, we evaluated various semi-empirical models for predicting proteoforms’ electrophoretic mobility (µef) using large-scale top-down proteomics datasets from CZE-MS/MS. We achieved a linear correlation between experimental and predicted µef of E. coli proteoforms (R² = 0.98) with a simple semi-empirical model, which uses the number of charges and molecular mass of each proteoform as the parameters. Our modeling data suggest that the complete unfolding of proteoforms during CZE separation benefits the prediction of their µef. Our results also indicate that N-terminal acetylation and phosphorylation both decrease proteoforms’ charge by roughly one charge unit.
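A semi-empirical mobility model that takes only charge number and molecular mass can be sketched as below. The description does not give the functional form or fitted coefficients, so the form a·ln(1 + b·Q)/M^c and the default values of `a`, `b`, and `c` are illustrative assumptions, not the study's reported model. The one-unit charge reduction for N-terminal acetylation and for each phosphorylation follows the approximate effect stated above. All function names are hypothetical.

```python
import math


def effective_charge(n_positive, n_term_acetylated=False, n_phospho=0):
    """Approximate charge number after PTM adjustment.

    N-terminal acetylation and each phosphorylation are each assumed to
    lower the charge by roughly one unit, as stated in the description.
    """
    charge = n_positive - n_phospho
    if n_term_acetylated:
        charge -= 1
    return charge


def predicted_mobility(charge, mass_da, a=1.0, b=0.35, c=0.411):
    """Assumed semi-empirical form: a * ln(1 + b*Q) / M**c.

    a, b, and c are placeholder coefficients, not fitted values.
    """
    if mass_da <= 0:
        raise ValueError("mass must be positive")
    return a * math.log(1.0 + b * charge) / mass_da ** c


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical proteoform: 10 positive charges, N-terminally acetylated,
# singly phosphorylated, 10 kDa.
q = effective_charge(10, n_term_acetylated=True, n_phospho=1)
mu = predicted_mobility(q, 10000.0)
```

Calling `r_squared` on predicted and experimentally measured mobilities for a set of identified proteoforms gives the kind of correlation figure quoted above; proteoforms whose measured mobility falls far from the prediction would be candidates for closer inspection.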
Project description:We present a large-scale top-down proteomics study of plant leaf and chloroplast proteins, achieving the identification of over 4700 unique proteoforms. Using capillary zone electrophoresis coupled with tandem mass spectrometry analysis of offline size-exclusion chromatography fractions, we identify 3198 proteoforms for total leaf and 1836 proteoforms for chloroplast, with 1024 and 363 proteoforms having post-translational modifications (PTMs), respectively. The electrophoretic mobility prediction of CZE allowed us to validate PTMs that impact the charge state, such as acetylation and phosphorylation. Identified PTMs included Trp (di)oxidation events on six chloroplast proteins that may represent novel targets of singlet oxygen sensing. Furthermore, our top-down proteomics data provide direct experimental evidence of the N- and C-terminal residues of numerous mature proteoforms from chloroplast, mitochondria, endoplasmic reticulum, and other sub-cellular localizations. With this information, we propose transit peptide cleavage sites and correct sub-cellular localization signal predictions. This large-scale analysis illustrates the power of top-down proteoform identification of PTMs and intact sequences that can benefit our understanding of both the structure and function of hundreds of plant proteins.
Project description:Proteomics has exposed a plethora of post-translational modifications, but demonstrating functional relevance requires new approaches. Top-down proteomics can characterize co-occurring modifications in terms of localization, abundance and hierarchy. Here, we present a top-down MS analysis workflow for the discovery and quantification of proteoforms. Confident fragment assignment allows for localization of modification sites and quantification of all proteoforms, including positional isomers, as validated by investigating synthetic isoforms of ubiquitin and hyper-phosphorylated Bora.
Project description:Understanding cancer metastasis at the proteoform level is crucial for discovering new protein biomarkers for cancer diagnosis and drug development. Proteins are the primary effectors of function in biology, and proteoforms from the same gene can have drastically different biological functions. Here, we present the first qualitative and quantitative top-down proteomics (TDP) study of a pair of isogenic human metastatic and non-metastatic colorectal cancer (CRC) cell lines (SW480 and SW620). This study pursues a global view of the human CRC proteome before and after metastasis in a proteoform-specific manner. We identified 23,319 proteoforms of 2,297 genes from the CRC cell lines using capillary zone electrophoresis-tandem mass spectrometry (CZE-MS/MS), representing nearly one order of magnitude improvement in the number of proteoform identifications from human cell lines compared to literature data. We identified 111 proteoforms containing single amino acid variants (SAAVs) using a proteogenomic approach and revealed drastic differences between the metastatic and non-metastatic cell lines in their SAAV profiles. Quantitative TDP analysis unveiled statistically significant differences in proteoform abundance between the SW480 and SW620 cell lines on a proteome scale for the first time. Ingenuity Pathway Analysis (IPA) disclosed that many differentially expressed genes at the proteoform level had diversified functions and were closely related to cancer. Our study represents a milestone in TDP towards the definition of the human proteome in a proteoform-specific manner, which will transform basic and translational biomedical research.