Project description:Metabolism is recognized as an important driver of cancer progression and other complex diseases, but global metabolite profiling remains a challenge. Protein expression profiling is often a poor proxy since existing pathway enrichment models provide an incomplete mapping between the proteome and metabolism. To overcome these gaps, we introduce multiomic metabolic enrichment network analysis (MOMENTA), an integrative multiomic data analysis framework for more accurately deducing metabolic pathway changes from proteomics data alone in a gene set analysis context by leveraging protein interaction networks to extend annotated metabolic models. We apply MOMENTA to proteomic data from diverse cancer cell lines and human tumors to demonstrate its utility in revealing variation in metabolic pathway activity across cancer types, which we verify using independent metabolomics measurements. The novel metabolic networks we uncover in breast cancer and other tumors are linked to clinical outcomes, underscoring the pathophysiological relevance of the findings.
Project description:Motivation: Advances in omics technologies have revolutionized cancer research by producing massive datasets. A common approach to deciphering these complex data is to apply embedding algorithms to molecular interaction networks. These algorithms find a low-dimensional space in which similarities between the network nodes are best preserved. Currently available embedding approaches mine the gene embeddings directly to uncover new cancer-related knowledge. However, these gene-centric approaches produce incomplete knowledge, since they do not account for the functional implications of genomic alterations. We propose a new, function-centric perspective and approach to complement the knowledge obtained from omic data. Results: We introduce our Functional Mapping Matrix (FMM) to explore the functional organization of different tissue-specific and species-specific embedding spaces generated by a Non-negative Matrix Tri-Factorization algorithm. Also, we use our FMM to define the optimal dimensionality of these molecular interaction network embedding spaces. For this optimal dimensionality, we compare the FMMs of the most prevalent cancers in humans to FMMs of their corresponding control tissues. We find that cancer alters the positions in the embedding space of cancer-related functions, while it keeps the positions of the noncancer-related ones. We exploit this spatial 'movement' to predict novel cancer-related functions. Finally, we predict novel cancer-related genes that the currently available methods for gene-centric analyses cannot identify; we validate these predictions by literature curation and retrospective analyses of patient survival data. Availability and implementation: Data and source code can be accessed at https://github.com/gaiac/FMM.
Project description:To better understand dynamic disease processes, integrated multi-omic methods are needed, yet comparing different types of omic data remains difficult. Integrative solutions benefit experimenters by eliminating potential biases that come with single omic analysis. We have developed the methods needed to explore whether a relationship exists between co-expression network models built from transcriptomic and proteomic data types, and whether this relationship can be used to improve the disease signature discovery process. A naïve, correlation-based method is used for comparison. Using publicly available infectious disease time series data, we analyzed the related co-expression structure of the transcriptome and proteome in response to SARS-CoV infection in mice. Transcript and peptide expression data were filtered using quality scores and subset by taking the intersection of mapped Entrez IDs. Using this data set, independent co-expression networks were built. The networks were integrated by constructing a bipartite module graph based on module member overlap, module summary correlation, and correlation to phenotypes of interest. Compared to the module level results, the naïve approach is hindered by a lack of correlation across data types, less significant enrichment results, and little functional overlap across data types. Our module graph approach avoids these problems, resulting in an integrated omic signature of disease progression, which allows prioritization across data types for downstream experiment planning. Integrated modules exhibited related functional enrichments and could suggest novel interactions in response to infection. These disease and platform-independent methods can be used to realize the full potential of multi-omic network signatures. The data (experiment SM001) are publicly available through the NIAID Systems Virology (https://www.systemsvirology.org) and PNNL (http://omics.pnl.gov) web portals. Phenotype data are found in the supplementary information. The ProCoNA package is available as part of Bioconductor 2.13.
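The module-overlap step of the bipartite module graph described above can be illustrated with a minimal Python sketch. It is not the ProCoNA API: the function name, the Jaccard index as the overlap measure, the `min_jaccard` threshold and the toy Entrez IDs are all assumptions made for illustration, and the other two criteria named in the description (module summary correlation and correlation to phenotypes) are left out.

```python
from itertools import product


def bipartite_module_graph(transcript_modules, peptide_modules, min_jaccard=0.1):
    """Link modules across data types by member overlap.

    Hypothetical helper: overlap is scored with the Jaccard index of the
    two modules' Entrez ID sets, which is an assumption of this sketch.
    """
    edges = []
    for (t_name, t_ids), (p_name, p_ids) in product(
        transcript_modules.items(), peptide_modules.items()
    ):
        union = t_ids | p_ids
        jaccard = len(t_ids & p_ids) / len(union) if union else 0.0
        if jaccard >= min_jaccard:
            # Edges only ever join a transcript module to a peptide module,
            # so the resulting graph is bipartite by construction.
            edges.append((t_name, p_name, round(jaccard, 3)))
    # Strongest overlaps first
    return sorted(edges, key=lambda e: -e[2])


# Toy modules keyed by name; the integers stand in for mapped Entrez IDs.
transcript_modules = {"T1": {101, 102, 103, 104}, "T2": {201, 202, 203}}
peptide_modules = {"P1": {102, 103, 104, 105}, "P2": {203, 301, 302}}

edges = bipartite_module_graph(transcript_modules, peptide_modules)
for t_name, p_name, score in edges:
    print(t_name, p_name, score)
```

In a fuller version each edge would also carry the correlation between module summaries and the correlation of each module to the phenotype of interest, so that module pairs can be ranked on all three criteria.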
Project description:Over the past decades, massive amounts of protein-protein interaction (PPI) data have accumulated owing to advances in high-throughput technologies, but data quality issues (noise or incompleteness) still affect the accuracy of protein function prediction based on PPI networks. Although numerous studies have reported that two main strategies, network reconstruction and edge enrichment, improve prediction performance, comparative studies of the performance differences between them are still lacking. Motivated by this gap, this study first uses three protein similarity metrics (local, global and sequence) for network reconstruction and edge enrichment in PPI networks, and then evaluates the performance differences among network reconstruction, edge enrichment and the original networks on two real PPI datasets. The experimental results demonstrate that edge enrichment works better than both network reconstruction and the original networks. Moreover, for the edge enrichment of PPI networks, sequence similarity outperforms both local and global similarity. In summary, our study can help biologists select suitable pre-processing schemes and achieve better protein function prediction for PPI networks.
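The difference between the two pre-processing strategies can be sketched in a few lines of Python. The description does not define them, so this sketch assumes that network reconstruction replaces the original edges with the pairs whose similarity passes a threshold, while edge enrichment adds those pairs to the original edges. The neighbor-set Jaccard index stands in for a local similarity metric; the function names, the threshold and the toy network are hypothetical.

```python
from itertools import combinations


def jaccard_neighbors(adj, u, v):
    """Local similarity: Jaccard index of the neighbor sets of u and v."""
    nu, nv = adj[u] - {v}, adj[v] - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def similar_pairs(adj, threshold):
    """All protein pairs whose local similarity reaches the threshold."""
    return {
        frozenset((u, v))
        for u, v in combinations(sorted(adj), 2)
        if jaccard_neighbors(adj, u, v) >= threshold
    }


def edge_set(adj):
    """Undirected edge set of an adjacency dictionary."""
    return {frozenset((u, v)) for u in adj for v in adj[u]}


# Toy PPI network as an adjacency dictionary (illustrative only).
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}

original = edge_set(adj)
predicted = similar_pairs(adj, threshold=0.5)

reconstructed = predicted          # assumed: similarity-derived edges only
enriched = original | predicted    # assumed: original plus predicted edges

print(len(original), len(reconstructed), len(enriched))
```

Under these assumptions enrichment never discards an observed interaction, whereas reconstruction can drop one when the two proteins share no neighbors, as happens to the C-D edge in the toy network.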
Project description:Motivation: The increasing availability of multi-omic data has enabled the discovery of disease biomarkers at different scales. Understanding the functional interaction between multi-omic biomarkers is becoming increasingly important due to its great potential for providing insights into the underlying molecular mechanism. Results: Leveraging multiple biological network databases, we integrated the relationships between single nucleotide polymorphisms (SNPs), genes/proteins and metabolites, and developed an R package, Multi-omic Network Explorer Tool (MoNET), for multi-omic network analysis. This new tool enables users not only to track the interaction of SNPs/genes at the metabolome level, but also to trace back to the potential risk variants/regulators given altered genes/metabolites. MoNET is expected to advance our understanding of multi-omic findings by unveiling their transomic interactions and is likely to generate new hypotheses for further validation. Availability and implementation: The MoNET package is freely available at https://github.com/JW-Yan/MONET. Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:CIC encodes a transcriptional repressor and MAPK signalling effector that is inactivated by loss-of-function mutations in several cancer types, consistent with a role as a tumour suppressor. Here, we used bioinformatic, genomic, and proteomic approaches to investigate CIC's interaction networks. We observed both previously identified and novel candidate interactions between CIC and SWI/SNF complex members, as well as novel interactions between CIC and cell cycle regulators and RNA processing factors. We found that CIC loss is associated with an increased frequency of mitotic defects in human cell lines and an in vivo mouse model and with dysregulated expression of mitotic regulators. We also observed aberrant splicing in CIC-deficient cell lines, predominantly at 3' and 5' untranslated regions of genes, including genes involved in MAPK signalling, DNA repair, and cell cycle regulation. Our study thus characterises the complexity of CIC's functional network and describes the effect of its loss on cell cycle regulation, mitotic integrity, and transcriptional splicing, thereby expanding our understanding of CIC's potential roles in cancer. In addition, our work exemplifies how multi-omic, network-based analyses can be used to uncover novel insights into the interconnected functions of pleiotropic genes/proteins across cellular contexts.
Project description:Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)-mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5-10 min, depending on user experience; data processing typically takes 1-3 h, and data analysis takes ∼30 min.
Project description:Prostate cancer is the most commonly diagnosed malignancy and the third leading cause of cancer deaths. GWAS have identified variants associated with prostate cancer susceptibility; however, mechanistic and functional validation of these mutations is lacking. We used CRISPR-Cas9 genome editing to introduce a missense variant identified in the ELAC2 gene, which encodes a dually localized nuclear and mitochondrial RNA processing enzyme, into the mouse Elac2 gene as well as to generate a prostate-specific knockout of Elac2. These mutations caused enlargement and inflammation of the prostate and nodule formation. The Elac2 variant or knockout mice on the background of the transgenic adenocarcinoma of the mouse prostate (TRAMP) model show that Elac2 mutation with a secondary genetic insult exacerbated the onset and progression of prostate cancer. Multi-omic profiling revealed defects in energy metabolism that activated proinflammatory and tumorigenic pathways as a consequence of impaired non-coding RNA processing and reduced protein synthesis. Our physiologically relevant models show that the ELAC2 variant is a predisposing factor for prostate cancer and identify changes that underlie the pathogenesis of this cancer.