Project description:Untargeted multi-omics analysis of plasma is an emerging tool for the identification of novel biomarkers for evaluating disease prognosis and for a better understanding of the molecular mechanisms underlying human disease. The successful application of metabolomic and proteomic approaches relies on reproducibly quantifying a wide range of metabolites and proteins. Herein, we report the results of untargeted metabolomic and proteomic analyses of blood plasma samples following analyte extraction with two frequently used solvent systems: chloroform/methanol and methanol only. Whole blood samples were collected from participants (n=6) at University Hospital Sharjah (UHS); plasma was then separated and extracted by two methods: (i) methanol precipitation and (ii) 4:3 methanol:chloroform extraction. The coverage and reproducibility of the two methods were assessed by ultra-high-performance liquid chromatography-electrospray ionization quadrupole time-of-flight mass spectrometry (UHPLC-ESI-QTOF-MS). The study revealed that extraction with methanol only showed greater reproducibility for both metabolomic and proteomic quantification than methanol/chloroform extraction, while yielding similar peptide coverage. However, metabolite coverage was higher with methanol/chloroform extraction.
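The description above does not state which statistic was used to score reproducibility. One common choice is the coefficient of variation (CV) of each feature across replicate extractions, summarized as a median per method. The following is a minimal sketch under that assumption; the function names and intensity values are hypothetical and are not taken from the study.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation (%) of one feature across replicates.

    Returns None when the mean is zero, because the CV is undefined there.
    """
    m = mean(values)
    if m == 0:
        return None
    return 100.0 * stdev(values) / m


def median_cv(feature_table):
    """Median CV (%) over all features with a defined CV.

    feature_table maps a feature id to its replicate intensities.
    """
    cvs = [cv_percent(v) for v in feature_table.values()]
    return median(c for c in cvs if c is not None)


# Hypothetical intensities (illustration only, not data from the study).
methanol_only = {"f1": [100.0, 105.0, 95.0], "f2": [50.0, 52.0, 48.0]}
methanol_chloroform = {"f1": [100.0, 130.0, 70.0], "f2": [50.0, 65.0, 35.0]}

# A lower median CV indicates more reproducible quantification.
print(median_cv(methanol_only), median_cv(methanol_chloroform))
```

A per-method median is used rather than a mean so that a few very noisy features do not dominate the comparison.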
Project description:Gene expression profiling by high-throughput sequencing captures changes in gene expression only at steady state and therefore does not reveal the underlying gene expression kinetics. Here, we describe a protocol that combines metabolic RNA labeling with thiol-specific chemical nucleoside conversion to determine the stability of polyadenylated RNA transcripts. We also provide a targeted mRNA 3′ end library preparation protocol that enables robust determination of stability even for RNA transcripts that escape reliable detection in untargeted libraries. The described methods enable cost-effective insights into the kinetics underlying steady-state gene expression, making it possible to study the mechanisms that regulate gene expression at a transcript-specific and genomic scale.
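The description above does not spell out how transcript stability is computed from the labeling data. As an illustration only, the sketch below assumes a chase experiment in which the fraction of labeled transcript decays as a single exponential, and estimates the half-life by a log-linear least-squares fit. The function name and input values are hypothetical.

```python
import math


def fit_half_life(times_h, labeled_fraction):
    """Estimate a half-life (hours) from labeled fractions over a chase.

    Assumes f(t) = f0 * exp(-k * t), so ln f(t) is linear in t with
    slope -k. All fractions must be strictly positive.
    """
    logs = [math.log(f) for f in labeled_fraction]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # decay rate constant (1/h)
    if k <= 0:
        raise ValueError("labeled fraction does not decay over time")
    return math.log(2) / k


# Hypothetical chase time points (h) and labeled fractions for one transcript.
times = [0.0, 1.0, 2.0, 4.0]
fractions = [0.80, 0.57, 0.40, 0.20]
print(fit_half_life(times, fractions))
```

Real data would also need background correction of conversion rates and a check that the single-exponential model fits; neither is attempted here.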
Project description:Background: Untargeted metabolomics datasets contain large proportions of uninformative features that can impede subsequent statistical analyses such as biomarker discovery and metabolic pathway analysis. Thus, there is a need for versatile, data-adaptive methods for filtering data prior to investigating the underlying biological phenomena. Here, we propose a data-adaptive pipeline for filtering metabolomics data generated by liquid chromatography-mass spectrometry (LC-MS) platforms. The pipeline includes novel methods for filtering features based on blank samples, proportions of missing values, and estimated intra-class correlation coefficients. Results: Using metabolomics datasets generated in our laboratory from samples of human blood, as well as two public LC-MS datasets, we compared our data-adaptive filtering method with traditional methods that rely on non-method-specific thresholds. The data-adaptive approach outperformed traditional approaches in removing noisy features and retaining high-quality, biologically informative ones. The R code for running the data-adaptive filtering method is provided at https://github.com/courtneyschiffman/Metabolomics-Filtering. Conclusions: Our proposed data-adaptive filtering pipeline is intuitive and effectively removes uninformative features from untargeted metabolomics datasets. It is particularly relevant for interrogating biological phenomena in data derived from complex matrices associated with biospecimens.
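The three filtering criteria named above (blank samples, missing values, intra-class correlation) can be illustrated with a short Python sketch. This is not the authors' R implementation: the thresholds are fixed, hypothetical defaults, whereas the described pipeline selects them adaptively from the data, and that selection step is not reproduced here. The ICC is the one-way ANOVA estimator ICC(1), assuming the same number of replicates per subject.

```python
from statistics import mean


def missing_fraction(values):
    """Proportion of missing (None) intensities for one feature."""
    return sum(v is None for v in values) / len(values)


def blank_ratio(sample_values, blank_values):
    """Mean biological intensity divided by mean blank intensity."""
    s = [v for v in sample_values if v is not None]
    b = [v for v in blank_values if v is not None]
    if not s:
        return 0.0
    if not b or mean(b) == 0:
        return float("inf")
    return mean(s) / mean(b)


def icc_oneway(groups):
    """ICC(1) from a one-way ANOVA.

    groups holds one list of replicate intensities per subject; all lists
    must have the same length k >= 2.
    """
    n, k = len(groups), len(groups[0])
    grand = mean(v for g in groups for v in g)
    msb = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    msw = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    return 0.0 if denom == 0 else (msb - msw) / denom


def keep_feature(samples, blanks, replicate_groups,
                 min_blank_ratio=3.0, max_missing=0.5, min_icc=0.4):
    """Apply the three filters in sequence (thresholds are hypothetical)."""
    if missing_fraction(samples) > max_missing:
        return False
    if blank_ratio(samples, blanks) < min_blank_ratio:
        return False
    return icc_oneway(replicate_groups) >= min_icc


# Hypothetical feature: well above blank, few missing values, stable replicates.
reps = [[100.0, 102.0], [150.0, 149.0], [120.0, 121.0]]
print(keep_feature([100.0, 120.0, None, 110.0], [10.0, 12.0], reps))
```

Applying the cheap filters first (missingness, then blank ratio) avoids computing the ICC for features that would be discarded anyway.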
Project description:Untargeted metabolomics using high-resolution liquid chromatography-mass spectrometry (LC-MS) is becoming one of the major areas of high-throughput biology. Functional analysis, that is, analyzing the data based on metabolic pathways or the genome-scale metabolic network, is critical for feature selection and interpretation of metabolomics data. One of the main challenges in functional analysis is the lack of feature identity in the LC-MS data itself. By matching the mass-to-charge ratio (m/z) values of the features to theoretical values derived from known metabolites, some features can be matched to one or more known metabolites. When multiple matches occur, in most cases only one of them can be true. At the same time, some known metabolites are missing from the measurements. Current network/pathway analysis methods ignore the uncertainty in metabolite identification and the missing observations, which can lead to errors in the selection of significant subnetworks/pathways. In this paper, we propose a flexible network feature selection framework that combines metabolomics data with the genome-scale metabolic network. The method adopts a sequential feature screening procedure and machine learning-based criteria to select important subnetworks and identify the optimal feature matching simultaneously. Simulation studies show that the proposed method has a much higher sensitivity than the commonly used maximal matching approach. For demonstration, we apply the method to a cohort of healthy subjects to detect subnetworks associated with body mass index (BMI). The method identifies several subnetworks that are supported by the current literature and detects some subnetworks with plausible new functional implications. The R code is available at http://web1.sph.emory.edu/users/tyu8/MSS.
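The m/z matching step described above, and the ambiguity it creates, can be sketched as follows. This is an illustration, not the authors' R code: the adduct list is restricted to two positive-mode ions, the 10 ppm tolerance is an assumed default, and the tiny metabolite table is hypothetical. Glucose and fructose share a molecular formula, so a single feature matches both.

```python
# Mass shifts (Da) added to the neutral monoisotopic mass for two
# common positive-mode adducts.
ADDUCT_SHIFTS = {"[M+H]+": 1.007276, "[M+Na]+": 22.989218}


def match_feature(feature_mz, metabolites, tol_ppm=10.0):
    """Return every (metabolite, adduct) whose theoretical m/z lies within
    tol_ppm of the observed feature m/z.

    metabolites maps a name to its neutral monoisotopic mass in Da.
    """
    matches = []
    for name, mono_mass in metabolites.items():
        for adduct, shift in ADDUCT_SHIFTS.items():
            theoretical = mono_mass + shift
            error_ppm = abs(feature_mz - theoretical) / theoretical * 1e6
            if error_ppm <= tol_ppm:
                matches.append((name, adduct))
    return matches


# Small illustrative lookup table (monoisotopic masses in Da).
KNOWN = {
    "glucose": 180.06339,   # C6H12O6
    "fructose": 180.06339,  # C6H12O6, isomer of glucose
    "alanine": 89.04768,    # C3H7NO2
}

# One observed feature, two candidate identities: m/z alone cannot decide.
print(match_feature(181.0707, KNOWN))
```

Any downstream network analysis has to resolve such multiple matches, which is the uncertainty the framework described above is designed to handle.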
Project description:<p><strong>INTRODUCTION:</strong> Extraction solvent mixtures were optimized for untargeted metabolomics analysis of microbial communities from two laboratory-scale activated sludge reactors performing enhanced biological phosphorus removal (EBPR).</p><p><strong>OBJECTIVE:</strong> To develop a robust and simple analytical protocol for analysing the metabolomes of microbial communities from EBPR bioreactors.</p><p><strong>METHODS:</strong> Extra- and intra-cellular metabolites were extracted using five methods and analysed by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS).</p><p><strong>RESULTS:</strong> The optimal extraction method was biomass specific: methanol:water (1:1 v/v) and methanol:chloroform:water (2:2:1 v/v) were chosen for the two bioreactors, respectively.</p><p><strong>CONCLUSION:</strong> Our approach provides direct surveys of the metabolic state of PAO-enriched EBPR communities, showing that extraction methods should be carefully tailored to the microbial community under study.</p>
Project description:Motivation: When metabolites are analyzed by electrospray ionization (ESI)-mass spectrometry, they are usually detected as multiple ion species due to the presence of isotopes, adducts and in-source fragments. The signals generated by these degenerate features (along with contaminants and other chemical noise) obscure meaningful patterns in MS data, complicating both compound identification and downstream statistical analysis. To address this problem, we developed Binner, a new tool for the discovery and elimination of many degenerate feature signals typically present in untargeted ESI-LC-MS metabolomics data. Results: Binner generates feature annotations and provides tools to help users visualize informative feature relationships that can further elucidate the underlying structure of the data. To demonstrate the utility of Binner and to evaluate its performance, we analyzed data from reversed-phase LC-MS and hydrophilic interaction chromatography (HILIC) platforms and demonstrated the accuracy of selected annotations using MS/MS. When we compared Binner annotations of 75 compounds previously identified in human plasma samples with annotations generated by three similar tools, we found that Binner achieves superior performance in the number and accuracy of annotations while simultaneously minimizing the number of incorrectly annotated principal ions. Data reduction and pattern exploration with Binner have allowed us to catalog a number of previously unrecognized complex adducts and neutral losses generated during the ionization of molecules in LC-MS. In summary, Binner allows users to explore patterns in their data and to efficiently and accurately eliminate a significant number of the degenerate features typically found in various LC-MS modalities. Availability and implementation: Binner is written in Java and is freely available from http://binner.med.umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
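To make the notion of degenerate features concrete, the sketch below flags pairs of co-eluting features whose m/z difference equals a known isotope, adduct, or neutral-loss spacing. It is a deliberately simplified illustration in Python and does not reproduce Binner's algorithm or its Java API; the tolerances, the three mass differences chosen, and the feature list are assumptions for the example.

```python
# Expected m/z spacings (Da) between a lighter and a heavier ion of the
# same compound, for singly charged species.
KNOWN_DELTAS = {
    "13C isotope": 1.003355,
    "[M+Na]+ vs [M+H]+": 21.981942,
    "water loss (lighter ion is the fragment)": 18.010565,
}


def annotate_pairs(features, rt_tol=0.05, mz_tol=0.005):
    """Label co-eluting feature pairs that differ by a known mass spacing.

    features is a list of (feature_id, mz, rt_minutes) tuples.
    Returns (lighter_id, heavier_id, label) tuples.
    """
    ordered = sorted(features, key=lambda f: f[1])  # ascending m/z
    annotations = []
    for i, (id_a, mz_a, rt_a) in enumerate(ordered):
        for id_b, mz_b, rt_b in ordered[i + 1:]:
            if abs(rt_a - rt_b) > rt_tol:
                continue  # not co-eluting, so unlikely to be the same compound
            delta = mz_b - mz_a
            for label, expected in KNOWN_DELTAS.items():
                if abs(delta - expected) <= mz_tol:
                    annotations.append((id_a, id_b, label))
    return annotations


# Hypothetical feature list: A, B and C co-elute; D has the isotope spacing
# relative to A but elutes four minutes later.
FEATURES = [
    ("A", 181.0707, 5.00),
    ("B", 182.0740, 5.01),
    ("C", 203.0526, 5.00),
    ("D", 182.0741, 9.00),
]
for row in annotate_pairs(FEATURES):
    print(row)
```

Requiring co-elution before comparing masses is what keeps feature D from being labeled as an isotope of A, even though its m/z spacing fits.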
Project description:Due to their sensitivity and speed, mass spectrometry-based analytical technologies are widely used in metabolomics to characterize biological phenomena. To address issues such as metadata organization, quality assessment, data processing, data storage, and, finally, submission to public repositories, non-interactive bioinformatic pipelines are often employed, complementing the interactive software used for initial inspection and visualization of the data. These pipelines are often created as open-source software, allowing complete and exhaustive documentation of each step and ensuring the reproducibility of the analysis of extensive and often expensive experiments. In this paper, we review the major steps that constitute such a data processing pipeline, discussing them in the context of open-source software for untargeted MS-based metabolomics experiments recently developed at our institute. The software has been developed by integrating our metaMS R package with a user-friendly web-based application written in Grails. MetaMS takes care of data pre-processing and annotation, while the interface deals with the creation of the sample lists, the organization of the data storage, and the generation of survey plots for quality assessment. Experimental and biological metadata are stored in the ISA-Tab format, making the proposed pipeline fully integrated with the MetaboLights framework.
Project description:Untargeted metabolomics and lipidomics LC-MS experiments produce complex datasets, usually containing tens of thousands of features from thousands of metabolites whose annotation requires additional MS/MS experiments and expert knowledge. All-ion fragmentation (AIF) LC-MS/MS acquisition provides fragmentation data at no additional experimental time cost. However, analysis of such datasets requires reconstruction of parent-fragment relationships and annotation of the resulting pseudo-MS/MS spectra. Here, we propose a novel approach for automated annotation of isotopologues, adducts, and in-source fragments from AIF LC-MS datasets by combining correlation-based parent-fragment linking with molecular fragment matching. Our workflow focuses on a subset of features rather than trying to annotate the full dataset, saving time and simplifying the process. We demonstrate the workflow on three human serum datasets containing 599 features manually annotated by experts. Precision and recall values of 82-92% and 82-85%, respectively, were obtained for features found in the highest-rank scores (1-5). These results equal or outperform those obtained using MS-DIAL software, the current state of the art for AIF data annotation. Further validation on other biological matrices and different instrument types showed variable precision (60-89%) and recall (10-88%), particularly for datasets dominated by nonlipid metabolites. The workflow is freely available as an open-source R package, MetaboAnnotatoR, together with the fragment libraries, on GitHub (https://github.com/gggraca/MetaboAnnotatoR).
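The correlation-based parent-fragment linking mentioned above can be illustrated with a minimal Python sketch: candidate fragment features are linked to a parent when they co-elute and their intensities across samples correlate strongly with the parent's. This is not the MetaboAnnotatoR implementation; the dictionary layout, the retention-time window, and the 0.8 correlation cut-off are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def link_fragments(parent, candidates, rt_tol=0.05, min_r=0.8):
    """Return (candidate_id, r) for candidates that co-elute with the parent
    and whose intensities across samples correlate with it at r >= min_r.

    Each feature is a dict with keys "id", "rt" (minutes) and
    "intensities" (one value per sample, same sample order).
    """
    linked = []
    for cand in candidates:
        if abs(cand["rt"] - parent["rt"]) > rt_tol:
            continue
        r = pearson(parent["intensities"], cand["intensities"])
        if r >= min_r:
            linked.append((cand["id"], round(r, 3)))
    return linked


# Hypothetical parent feature and three candidate fragment features.
PARENT = {"id": "parent", "rt": 5.00, "intensities": [10.0, 20.0, 30.0, 40.0]}
CANDIDATES = [
    {"id": "frag1", "rt": 5.01, "intensities": [1.0, 2.0, 3.0, 4.1]},
    {"id": "frag2", "rt": 5.00, "intensities": [4.0, 1.0, 3.0, 2.0]},
    {"id": "frag3", "rt": 7.50, "intensities": [2.0, 4.0, 6.0, 8.0]},
]
print(link_fragments(PARENT, CANDIDATES))
```

The linked fragments form a pseudo-MS/MS spectrum for the parent, which would then be compared against fragment libraries in the matching step described above.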