CANVS: an easy-to-use application for the analysis and visualization of mass spectrometry-based protein-protein interaction/association data.
ABSTRACT: The elucidation of a protein's interaction/association network is important for defining its biological function. Mass spectrometry-based proteomic approaches have emerged as powerful tools for identifying protein-protein interactions (PPIs) and protein-protein associations (PPAs). However, interactome/association experiments are difficult to interpret, considering the complexity and abundance of data that are generated. Although tools have been developed to identify protein interactions/associations quantitatively, there is still a pressing need for easy-to-use tools that allow users to contextualize their results. To address this, we developed CANVS, a computational pipeline that cleans, analyzes, and visualizes mass spectrometry-based interactome/association data. CANVS is wrapped as an interactive Shiny dashboard with simple requirements, allowing users to interface easily with the pipeline, analyze complex experimental data, and create PPI/A networks. The application integrates systems biology databases such as BioGRID and CORUM to contextualize the results. Furthermore, CANVS features a Gene Ontology tool that allows users to identify relevant GO terms in their results and create visual networks with proteins associated with relevant GO terms. Overall, CANVS is an easy-to-use application that benefits all researchers, especially those who lack an established bioinformatic pipeline and are interested in studying interactome/association data.
Project description: Protein cross-linking mass spectrometry (CL-MS) enables the sensitive detection of protein interactions and the inference of protein complex topology. The detection of chemical cross-links between protein residues can identify intra- and interprotein contact sites or provide physical constraints for molecular modeling of protein structure. Recent innovations in cross-linker design, sample preparation, mass spectrometry, and software tools have significantly improved CL-MS approaches. Although a number of algorithms now exist for the identification of cross-linked peptides from mass spectral data, a dearth of user-friendly analysis tools represents a practical bottleneck to the broad adoption of the approach. To facilitate the analysis of CL-MS data, we developed CLMSVault, a software suite designed to leverage existing CL-MS algorithms and provide intuitive and flexible tools for cross-platform data interpretation. CLMSVault stores and combines complementary information obtained from different cross-linkers and search algorithms. CLMSVault provides filtering, comparison, and visualization tools to support CL-MS analyses and includes a workflow for label-free quantification of cross-linked peptides. An embedded 3D viewer enables the visualization of quantitative data and the mapping of cross-linked sites onto PDB structural models. We demonstrate the application of CLMSVault for the analysis of a noncovalent Cdc34-ubiquitin protein complex cross-linked under different conditions. CLMSVault is open-source software (available at https://gitlab.com/courcelm/clmsvault.git), and a live demo is available at http://democlmsvault.tyerslab.com/.
Project description: Background: The spatial distribution and colocalization of functionally related metabolites are analysed in order to investigate the spatial (and functional) aspects of molecular networks. We propose community detection for the analysis of m/z-images, grouping molecules with correlated spatial distributions into communities that hint at functional networks or pathway activity. To detect communities, we investigate a spectral approach that optimizes the modularity measure. We present an analysis pipeline and an online interactive visualization tool to facilitate explorative analysis of the results. The approach is illustrated with synthetic benchmark data and two real-world data sets (barley seed and glioblastoma section). Results: For the barley sample data set, our approach is able to reproduce the findings of a previous work that identified groups of molecules with distributions that correlate with anatomical structures of the barley seed. The analysis of glioblastoma section data revealed that some molecular compositions are locally focused, indicating the existence of a meaningful separation in at least two areas. This result is in line with the prior histological knowledge. In addition to confirming prior findings, the resulting graph structures revealed new subcommunities of m/z-images (i.e. metabolites) with more detailed distribution patterns. Another result of our work is the development of an interactive webtool called GRINE (Analysis of GRaph mapped Image Data NEtworks). Conclusions: The proposed method was successfully applied to identify molecular communities of laterally co-localized molecules. For both application examples, the detected communities showed inherent substructures that could easily be investigated with the proposed visualization tool. This shows the potential of this approach as a complementary addition to pixel clustering methods.
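As a rough illustration of the modularity measure referred to in the GRINE entry above, the following Python sketch computes Newman's modularity Q for a given grouping of m/z-images. It is not the authors' pipeline: the spectral optimization step is omitted, and the edge list (assumed here to come from thresholding pairwise spatial correlations between m/z-images), the node names, and the function name `modularity` are all hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges:     list of (u, v) pairs; hypothetically, two m/z-images are linked
               when their spatial correlation exceeds a chosen threshold.
    community: dict mapping each node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    # Degree of every node that appears in at least one edge.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Observed fraction of edges that fall inside a community.
    inside = sum(1 for u, v in edges if community[u] == community[v]) / m

    # Expected fraction under a random graph with the same degrees.
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    expected = sum((d / (2 * m)) ** 2 for d in degree_sum.values())

    return inside - expected


# Toy graph: two triangles of m/z-images joined by a single edge.
toy_edges = [("mz1", "mz2"), ("mz2", "mz3"), ("mz1", "mz3"),
             ("mz4", "mz5"), ("mz5", "mz6"), ("mz4", "mz6"),
             ("mz3", "mz4")]
two_groups = {"mz1": 0, "mz2": 0, "mz3": 0, "mz4": 1, "mz5": 1, "mz6": 1}
one_group = {node: 0 for node in two_groups}
```

On this toy graph, splitting the images into the two triangles scores higher than putting every image in one community, which is the behaviour a modularity-optimizing method exploits.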
Project description: The rise of intact protein analysis by mass spectrometry (MS) was accompanied by an increasing need for flexible tools allowing data visualization and analysis. These include inspection of the deconvoluted molecular weights of the proteoforms eluted during liquid chromatography (LC) through their representation in three-dimensional (3D) liquid chromatography coupled to mass spectrometry (LC-MS) maps (plots of deconvoluted molecular weights, retention times, and intensity of the MS signal). With this aim, we developed a free and open-source web application named VisioProt-MS (https://masstools.ipbs.fr/mstools/visioprot-ms/). VisioProt-MS is highly compatible with many algorithms and software developed by the community to integrate and deconvolute top-down and intact protein MS data. Its dynamic and user-friendly features greatly facilitate analysis through several graphical representations dedicated to MS and tandem mass spectrometry (MS/MS) analysis of proteoforms in complex samples. Here, we illustrate the importance of LC-MS map visualization to optimize top-down acquisition/search parameters and analyze intact protein MS data. We go through the main features of VisioProt-MS using the human proteasomal 20S core particle as a use case.
Project description: Background: Comprehensive protein-protein interaction (PPI) maps are a powerful resource for uncovering the molecular basis of genetic interactions and providing mechanistic insights. Over the past decade, high-throughput experimental techniques have been developed to generate PPI maps at proteome scale, first using yeast two-hybrid approaches and more recently via affinity purification combined with mass spectrometry (AP-MS). Unfortunately, data from both protocols are prone to high false-positive and false-negative rates. To address these issues, many methods have been developed to post-process raw PPI data. However, with few exceptions, these methods only analyze binary experimental data (in which each potential interaction tested is deemed either observed or unobserved), neglecting quantitative information available from AP-MS such as spectral counts. Results: We propose a novel method for incorporating quantitative information from AP-MS data into existing PPI inference methods that analyze binary interaction data. Our approach introduces a probabilistic framework that models the statistical noise inherent in observations of co-purifications. Using a sampling-based approach, we model the uncertainty of interactions with low spectral counts by generating an ensemble of possible alternative experimental outcomes. We then apply the existing method of choice to each alternative outcome and aggregate results over the ensemble. We validate our approach on three recent AP-MS data sets and demonstrate performance comparable to or better than state-of-the-art methods. Additionally, we provide an in-depth discussion comparing the theoretical bases of existing approaches and identify common aspects that may be key to their performance. Conclusions: Our sampling framework extends the existing body of work on PPI analysis using binary interaction data to apply to the richer quantitative data now commonly available through AP-MS assays. This framework is quite general, and many enhancements are likely possible. Fruitful future directions may include investigating more sophisticated schemes for converting spectral counts to probabilities and applying the framework to direct protein complex prediction methods.
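The sampling idea summarized in the entry above (draw alternative binary outcomes from spectral counts, score each with an existing binary method, then aggregate) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the count-to-probability mapping `saturating_prob`, the stand-in scorer `presence_scorer`, and all other names are hypothetical, and a real analysis would substitute a published binary PPI scoring method.

```python
import random


def saturating_prob(count, half_point=2.0):
    """Hypothetical mapping from a spectral count to an observation probability.

    Zero counts map to 0; larger counts approach 1.
    """
    return count / (count + half_point)


def presence_scorer(observed):
    """Stand-in for an existing binary PPI method (assumption, for illustration).

    Scores each bait-prey pair 1.0 if it was observed in this outcome, else 0.0.
    """
    return {pair: 1.0 if seen else 0.0 for pair, seen in observed.items()}


def ensemble_scores(spectral_counts, count_to_prob, binary_method,
                    n_samples=200, seed=0):
    """Average a binary method's scores over sampled alternative outcomes."""
    rng = random.Random(seed)
    totals = {pair: 0.0 for pair in spectral_counts}
    for _ in range(n_samples):
        # One alternative experimental outcome: each pair is kept with a
        # probability derived from its spectral count.
        observed = {pair: rng.random() < count_to_prob(count)
                    for pair, count in spectral_counts.items()}
        for pair, score in binary_method(observed).items():
            totals[pair] += score
    return {pair: total / n_samples for pair, total in totals.items()}


# Toy spectral counts for one bait and three preys (made-up values).
toy_counts = {("baitA", "prey1"): 0,
              ("baitA", "prey2"): 1,
              ("baitA", "prey3"): 20}
toy_scores = ensemble_scores(toy_counts, saturating_prob, presence_scorer)
```

With this toy scorer, pairs supported by low spectral counts receive lower aggregated scores than pairs with high counts, rather than being treated as simply observed or unobserved.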
Project description: Protein-protein interactions (PPIs) are key therapeutic targets. Most PPI-targeting drugs in the clinic inhibit these important interactions; however, stabilising PPIs is an attractive alternative in cases where a PPI is disrupted in a disease state. The discovery of novel PPI stabilisers has been hindered by the lack of tools available to monitor PPI stabilisation. Moreover, for PPI stabilisation to be detected, both the stoichiometry of binding and the shift this has on the binding equilibria need to be monitored simultaneously. Here, we show the power of native mass spectrometry (MS) in the rapid search for PPI stabilisers. To demonstrate its capability, we focussed on three PPIs between the eukaryotic regulatory protein 14-3-3σ and its binding partners estrogen receptor ERα, the tumour suppressor p53, and the kinase LRRK2, whose interactions upon the addition of a small molecule, fusicoccin A, are differentially stabilised. Within a single measurement, the stoichiometry and binding equilibria between 14-3-3 and each of its binding partners were evident. Upon addition of the fusicoccin A stabiliser, a dramatic shift in binding equilibria was observed with the 14-3-3:ERα complex compared with the 14-3-3:p53 and 14-3-3:LRRK2 complexes. Our results highlight how native MS can not only distinguish the ability of stabilisers to modulate PPIs, but also give important insights into the dynamics of ternary complex formation. Finally, we show how native MS can be used as a screening tool to search for PPI stabilisers, highlighting its potential role as a primary screening technology in the hunt for novel therapeutic PPI stabilisers.
Project description: ProXL is a Web application and accompanying database designed for sharing, visualizing, and analyzing bottom-up protein cross-linking mass spectrometry data with an emphasis on structural analysis and quality control. ProXL is designed to be independent of any particular software pipeline. The import process is simplified by the use of the ProXL XML data format, which shields developers of data importers from the relative complexity of the relational database schema. The database and Web interfaces function equally well for any software pipeline and allow data from disparate pipelines to be merged and contrasted. ProXL includes robust public and private data sharing capabilities, including a project-based interface designed to ensure security and facilitate collaboration among multiple researchers. ProXL provides multiple interactive and highly dynamic data visualizations that facilitate structure-based analysis of the observed cross-links as well as quality control. ProXL is open-source, well-documented, and freely available at https://github.com/yeastrc/proxl-web-app.
Project description: Background: Cutaneous leishmaniasis is caused by several Leishmania species that are associated with variable outcomes before and after therapy. Optimal treatment decisions depend on accurate identification of the infecting species, but current methods to type Leishmania isolates are relatively complex and/or slow. Therefore, the initial treatment decision is generally presumptive, the infecting species being suspected on epidemiological and clinical grounds. A simple method to type cultured isolates would facilitate disease management. Methodology: We analyzed MALDI-TOF spectra of promastigote pellets from 46 strains cultured in monophasic medium, including 20 short-term cultured isolates from French travelers (19 with CL, 1 with VL). As per routine procedure, clinical isolates were analyzed in parallel with Multilocus Sequence Typing (MLST) at the National Reference Center for Leishmania. Principal findings: Automatic dendrogram analysis generated a classification of isolates consistent with reference determination of species based on MLST or hsp70 sequencing. A detailed analysis of the spectra, based on a very simple, database-independent algorithm, showed that the mutually exclusive presence of two pairs of peaks discriminated isolates considered by reference methods to belong either to the Viannia or Leishmania subgenus, and that within each subgenus the presence or absence of a few peaks allowed discrimination to the species complex level. Conclusions/significance: Analysis of cultured Leishmania isolates using mass spectrometry allows a rapid and simple classification to the species complex level consistent with reference methods, a potentially useful method to guide treatment decisions in patients with cutaneous leishmaniasis.
Project description: Drugs are often metabolized to reactive intermediates that form protein adducts. Adducts can inhibit protein activity, elicit immune responses, and cause life-threatening adverse drug reactions. The masses of reactive metabolites are frequently unknown, rendering traditional mass spectrometry-based proteomics approaches incapable of adduct identification. Here, we present Magnum, an open-mass search algorithm optimized for adduct identification, and Limelight, a web-based data processing package for analysis and visualization of data from all existing algorithms. Limelight incorporates tools for sample comparisons and xenobiotic-adduct discovery. We validate our tools with three drug/protein combinations and apply our label-free workflow to identify novel xenobiotic-protein adducts in CYP3A4. Our new methods and software enable accurate identification of xenobiotic-protein adducts with no prior knowledge of adduct masses or protein targets. Magnum outperforms existing label-free tools in xenobiotic-protein adduct discovery, while Limelight fulfills a major need in the rapidly developing field of open-mass searching, which until now lacked comprehensive data visualization tools.
Project description: We present SAINT-MS1, a statistical method for scoring protein-protein interactions based on the label-free MS1 intensity data from affinity purification-mass spectrometry (AP-MS) experiments. The method is an extension of Significance Analysis of INTeractome (SAINT), a model-based method previously developed for spectral count data. We reformulated the statistical model for log-transformed intensity data, including adequate treatment of missing observations, that is, interactions identified in some but not all replicate purifications. We demonstrate the performance of SAINT-MS1 using two recently published data sets: a small LTQ-Orbitrap data set with three replicate purifications of a single human bait protein and control purifications, and a larger Drosophila data set targeting the insulin receptor/target of rapamycin signaling pathway generated using an LTQ-FT instrument. Using the Drosophila data set, we also compare and discuss the performance of SAINT analysis based on spectral count and MS1 intensity data in terms of the recovery of orthologous and literature-curated interactions. Given rapid advances in high mass accuracy instrumentation and intensity-based label-free quantification software, we expect that SAINT-MS1 will become a useful tool allowing improved detection of protein interactions in label-free AP-MS data, especially in the low abundance range.
Project description: Summary: Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. Availability and implementation: PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. Supplementary information: Supplementary data are available at Bioinformatics online.