Project description:We present a method for the systematic identification of picogram quantities of new lipids in total extracts of tissues and fluids. It relies on the modularity of lipid structures and applies all-ions fragmentation LC-MS/MS and Arcadiate software to recognize individual modules originating from the same lipid precursor of known or assumed structure. In this way it alleviates the need to recognize and fragment very low-abundance precursors of novel molecules in complex lipid extracts. In a single analysis of rat kidney extract the method identified 58 known and discovered 74 novel endogenous endocannabinoids and endocannabinoid-related molecules, including a novel class of N-acylaspartates that inhibit Hedgehog signaling while having no impact on endocannabinoid receptors.
Project description:Background: Untargeted metabolomics datasets contain large proportions of uninformative features that can impede subsequent statistical analysis such as biomarker discovery and metabolic pathway analysis. Thus, there is a need for versatile and data-adaptive methods for filtering data prior to investigating the underlying biological phenomena. Here, we propose a data-adaptive pipeline for filtering metabolomics data that are generated by liquid chromatography-mass spectrometry (LC-MS) platforms. Our data-adaptive pipeline includes novel methods for filtering features based on blank samples, proportions of missing values, and estimated intra-class correlation coefficients. Results: Using metabolomics datasets that were generated in our laboratory from samples of human blood, as well as two public LC-MS datasets, we compared our data-adaptive filtering method with traditional methods that rely on non-method-specific thresholds. The data-adaptive approach outperformed traditional approaches in terms of removing noisy features and retaining high-quality, biologically informative ones. The R code for running the data-adaptive filtering method is provided at https://github.com/courtneyschiffman/Metabolomics-Filtering. Conclusions: Our proposed data-adaptive filtering pipeline is intuitive and effectively removes uninformative features from untargeted metabolomics datasets. It is particularly relevant for interrogation of biological phenomena in data derived from complex matrices associated with biospecimens.
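The three filtering criteria named in this abstract (blank samples, proportion of missing values, intra-class correlation) can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names, the log-scale inputs, and the fixed cutoffs `min_log_diff`, `max_missing`, and `min_icc` are all assumptions chosen for illustration, whereas the published pipeline derives its thresholds from the data.

```python
from statistics import mean


def missing_fraction(values):
    """Fraction of samples in which the feature was not detected (None)."""
    return sum(v is None for v in values) / len(values)


def passes_blank_filter(sample_vals, blank_vals, min_log_diff):
    """True if mean log abundance in samples exceeds the blanks by min_log_diff."""
    return mean(sample_vals) - mean(blank_vals) > min_log_diff


def icc_oneway(groups):
    """One-way random-effects ICC for a balanced design.

    `groups` holds one list of replicate measurements per subject.
    """
    n, k = len(groups), len(groups[0])
    grand = mean(v for g in groups for v in g)
    msb = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    msw = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    return 0.0 if denom == 0 else (msb - msw) / denom


def keep_feature(samples, blanks, replicates,
                 min_log_diff=1.0, max_missing=0.5, min_icc=0.4):
    """Apply the three filters in turn; all cutoffs are hypothetical."""
    observed = [v for v in samples if v is not None]
    return (bool(observed)
            and missing_fraction(samples) <= max_missing
            and passes_blank_filter(observed, blanks, min_log_diff)
            and icc_oneway(replicates) >= min_icc)


# Toy log-abundance values, invented for this example.
good = keep_feature([8.1, 8.4, None, 7.9], [3.0, 3.2],
                    [[8.0, 8.1], [8.4, 8.5], [7.8, 7.9]])
noisy = keep_feature([3.1, None, None, None], [3.0, 3.2],
                     [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]])
```

In this toy input the first feature is well above the blanks, mostly detected, and reproducible across replicates, so it is kept; the second is mostly missing and is dropped.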
Project description:Pooled quality controls (QCs) are usually implemented within untargeted methods to improve the quality of datasets by removing features either not detected or not reproducible. However, this approach can be limiting in exposomics studies conducted on groups of exposed and nonexposed subjects, as compounds present at low levels only in exposed subjects can be diluted and thus not detected in the pooled QC. The aim of this work is to develop and apply an untargeted workflow for human biomonitoring in urine samples, implementing a novel separated approach for preparing pooled quality controls. An LC-MS/MS workflow was developed and applied to a case study of smoking and non-smoking subjects. Three different pooled quality controls were prepared: mixing an aliquot from every sample (QC-T), only from non-smokers (QC-NS), and only from smokers (QC-S). The feature tables were filtered using QC-T (T-feature list), QC-S, and QC-NS, separately. The last two feature lists were merged (SNS-feature list). A higher number of features was obtained with the SNS-feature list than the T-feature list, resulting in identification of a higher number of biologically significant compounds. The separated pooled QC strategy implemented can improve the nontargeted human biomonitoring for groups of exposed and nonexposed subjects.
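The separated pooled-QC idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual code: the detection and coefficient-of-variation cutoffs (`min_detect_frac`, `max_cv`), the function names, and the toy intensities are hypothetical, since the abstract does not give the filtering criteria.

```python
from statistics import mean, stdev


def qc_pass(intensities, max_cv=0.30, min_detect_frac=0.5):
    """Keep a feature if it is detected often enough in the repeated QC
    injections and its coefficient of variation is small enough."""
    detected = [v for v in intensities if v is not None]
    if len(detected) < 2 or len(detected) / len(intensities) < min_detect_frac:
        return False
    m = mean(detected)
    return m > 0 and stdev(detected) / m <= max_cv


def filter_features(qc_table, **kwargs):
    """Return the set of feature IDs that pass the QC criteria."""
    return {f for f, vals in qc_table.items() if qc_pass(vals, **kwargs)}


# Toy data: F2 is present only in exposed subjects, so it is diluted below
# detection in the total pool (QC-T) but visible in the smokers' pool (QC-S).
qc_t = {"F1": [100, 105, 98], "F2": [None, None, None], "F3": [50, 120, 10]}
qc_s = {"F1": [101, 99, 103], "F2": [40, 42, 39], "F3": [55, 110, 12]}
qc_ns = {"F1": [100, 102, 97], "F2": [None, None, None], "F3": [48, 130, 9]}

t_list = filter_features(qc_t)                              # T-feature list
sns_list = filter_features(qc_s) | filter_features(qc_ns)   # merged SNS list
```

Merging by set union keeps any feature that is reproducible in at least one group-specific pool, which is why the SNS list can be longer than the T list.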
Project description:Untargeted metabolomics using high-resolution liquid chromatography-mass spectrometry (LC-MS) is becoming one of the major areas of high-throughput biology. Functional analysis, that is, analyzing the data based on metabolic pathways or the genome-scale metabolic network, is critical in feature selection and interpretation of metabolomics data. One of the main challenges in the functional analyses is the lack of the feature identity in the LC-MS data itself. By matching mass-to-charge ratio (m/z) values of the features to theoretical values derived from known metabolites, some features can be matched to one or more known metabolites. When multiple matchings occur, in most cases only one of the matchings can be true. At the same time, some known metabolites are missing in the measurements. Current network/pathway analysis methods ignore the uncertainty in metabolite identification and the missing observations, which could lead to errors in the selection of significant subnetworks/pathways. In this paper, we propose a flexible network feature selection framework that combines metabolomics data with the genome-scale metabolic network. The method adopts a sequential feature screening procedure and machine learning-based criteria to select important subnetworks and identify the optimal feature matching simultaneously. Simulation studies show that the proposed method has a much higher sensitivity than the commonly used maximal matching approach. For demonstration, we apply the method on a cohort of healthy subjects to detect subnetworks associated with the body mass index (BMI). The method identifies several subnetworks that are supported by the current literature, as well as detects some subnetworks with plausible new functional implications. The R code is available at http://web1.sph.emory.edu/users/tyu8/MSS.
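The matching ambiguity described here can be shown with a short Python sketch of m/z matching within a ppm tolerance. The tiny metabolite table, the 10 ppm tolerance, the restriction to [M+H]+ ions, and the function name are assumptions for illustration; the proposed framework itself goes further and resolves such ambiguity using the metabolic network.

```python
PROTON_MASS = 1.007276  # Da, mass added for an [M+H]+ ion

# Monoisotopic neutral masses for a few known metabolites (illustrative table).
KNOWN_METABOLITES = {
    "leucine": 131.09463,     # C6H13NO2
    "isoleucine": 131.09463,  # C6H13NO2, isomer of leucine
    "glucose": 180.06339,     # C6H12O6
    "fructose": 180.06339,    # C6H12O6, isomer of glucose
}


def match_mz(observed_mz, tol_ppm=10.0):
    """Return all metabolites whose theoretical [M+H]+ m/z lies within
    tol_ppm of the observed value."""
    hits = []
    for name, mass in KNOWN_METABOLITES.items():
        theoretical = mass + PROTON_MASS
        ppm_error = (observed_mz - theoretical) / theoretical * 1e6
        if abs(ppm_error) <= tol_ppm:
            hits.append(name)
    return sorted(hits)


ambiguous = match_mz(132.1019)   # matches two isomers
unmatched = match_mz(150.0000)   # matches nothing in this table
```

A single feature matching both leucine and isoleucine is exactly the case where, as the abstract notes, at most one of the matchings can be true.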
Project description:LC-MS-based untargeted metabolomics is heavily dependent on algorithms for automated peak detection and data preprocessing due to the complexity and size of the raw data generated. These algorithms are generally designed to be as inclusive as possible in order to minimize the number of missed peaks. This is known to result in an abundance of false positive peaks that further complicate downstream data processing and analysis. As a consequence, considerable effort is spent identifying features of interest that might represent peak detection artifacts. Here, we present the CPC algorithm, which allows automated characterization of detected peaks with subsequent filtering of low quality peaks using quality criteria familiar to analytical chemists. We provide a thorough description of the methods in addition to applying the algorithms to authentic metabolomics data. In the example presented, the algorithm removed about 35% of the peaks detected by XCMS, a majority of which exhibited a low signal-to-noise ratio. The algorithm is made available as an R-package and can be fully integrated into a standard XCMS workflow.
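Filtering on signal-to-noise ratio, one of the quality criteria mentioned above, can be sketched in Python. This is not the CPC algorithm: the baseline and noise estimates (median and scaled median absolute deviation of points outside the peak window), the window parameters, and the cutoff of 3 are assumptions made for this example.

```python
from statistics import median


def signal_to_noise(trace, apex, half_width):
    """Apex height above baseline, divided by a robust noise estimate taken
    from the points outside the peak window."""
    left = trace[:max(0, apex - half_width)]
    right = trace[apex + half_width + 1:]
    outside = left + right
    baseline = median(outside)
    # Median absolute deviation, scaled to approximate a standard deviation.
    noise = 1.4826 * median(abs(v - baseline) for v in outside)
    if noise == 0:
        return float("inf")
    return (trace[apex] - baseline) / noise


def keep_peak(trace, apex, half_width, min_sn=3.0):
    """Hypothetical filter: keep peaks whose S/N reaches min_sn."""
    return signal_to_noise(trace, apex, half_width) >= min_sn


# Toy chromatographic traces (intensity per scan), invented for illustration.
real_peak = [10, 12, 9, 11, 10, 50, 200, 60, 11, 9, 10, 12]
noise_bump = [10, 12, 9, 11, 10, 11, 13, 10, 11, 9, 10, 12]

kept = keep_peak(real_peak, apex=6, half_width=2)
dropped = not keep_peak(noise_bump, apex=6, half_width=2)
```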
Project description:Motivation: When metabolites are analyzed by electrospray ionization (ESI)-mass spectrometry, they are usually detected as multiple ion species due to the presence of isotopes, adducts and in-source fragments. The signals generated by these degenerate features (along with contaminants and other chemical noise) obscure meaningful patterns in MS data, complicating both compound identification and downstream statistical analysis. To address this problem, we developed Binner, a new tool for the discovery and elimination of many degenerate feature signals typically present in untargeted ESI-LC-MS metabolomics data. Results: Binner generates feature annotations and provides tools to help users visualize informative feature relationships that can further elucidate the underlying structure of the data. To demonstrate the utility of Binner and to evaluate its performance, we analyzed data from reversed phase LC-MS and hydrophilic interaction chromatography (HILIC) platforms and demonstrated the accuracy of selected annotations using MS/MS. When we compared Binner annotations of 75 compounds previously identified in human plasma samples with annotations generated by three similar tools, we found that Binner achieves superior performance in the number and accuracy of annotations while simultaneously minimizing the number of incorrectly annotated principal ions. Data reduction and pattern exploration with Binner have allowed us to catalog a number of previously unrecognized complex adducts and neutral losses generated during the ionization of molecules in LC-MS. In summary, Binner allows users to explore patterns in their data and to efficiently and accurately eliminate a significant number of the degenerate features typically found in various LC-MS modalities. Availability and implementation: Binner is written in Java and is freely available from http://binner.med.umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:Untargeted metabolomics can detect more than 10 000 peaks in a single LC-MS run. The correspondence between these peaks and metabolites, however, remains unclear. Here, we introduce a Peak Annotation and Verification Engine (PAVE) for annotating untargeted microbial metabolomics data. The workflow involves growing cells in 13C and 15N isotope-labeled media to identify peaks from biological compounds and their carbon and nitrogen atom counts. Improved deisotoping and deadducting are enabled by algorithms that integrate positive mode, negative mode, and labeling data. To distinguish metabolites and their fragments, PAVE experimentally measures the response of each peak to weak in-source collision-induced dissociation, which increases the peak intensity for fragments while decreasing it for their parent ions. The molecular formulas of the putative metabolites are then assigned based on database searching using both m/z and C/N atom counts. Application of this procedure to Saccharomyces cerevisiae and Escherichia coli revealed that more than 80% of peaks do not label, i.e., are environmental contaminants. More than 70% of the biological peaks are isotopic variants, adducts, fragments, or mass spectrometry artifacts, yielding ∼2000 apparent metabolites across the two organisms. About 650 match a known metabolite formula based on m/z and C/N atom counts, with 220 assigned structures based on MS/MS and/or retention-time matches to authenticated standards. Thus, PAVE enables systematic annotation of LC-MS metabolomics data with only ∼4% of peaks annotated as apparent metabolites.
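How isotope labeling yields atom counts can be shown with a small Python sketch: the m/z shift between the unlabeled peak and its fully labeled counterpart, divided by the per-atom isotope mass difference, gives the number of carbon or nitrogen atoms. This is a sketch of the principle only, not PAVE's algorithm; the function name, the tolerance, and the example m/z values are assumptions.

```python
C13_SHIFT = 1.003355  # Da, mass difference between 13C and 12C
N15_SHIFT = 0.997035  # Da, mass difference between 15N and 14N


def atom_count(unlabeled_mz, labeled_mz, shift_per_atom, charge=1, tol=0.01):
    """Estimate how many atoms of one element a peak contains from the m/z
    shift seen in fully labeled media.

    Returns 0 for a peak that does not shift (it does not label), or None
    when the shift is not close to a whole number of atoms.
    """
    raw = (labeled_mz - unlabeled_mz) * abs(charge) / shift_per_atom
    n = round(raw)
    if abs(raw - n) * shift_per_atom > tol:
        return None
    return n


# Glucose as [M-H]- (C6H12O6): the fully 13C-labeled peak shifts by six carbons.
glucose_carbons = atom_count(179.0561, 185.0762, C13_SHIFT)
# A peak that stays at the same m/z in labeled media carries no labeled
# carbon, which is how non-biological contaminants are flagged.
contaminant_carbons = atom_count(301.1410, 301.1410, C13_SHIFT)
```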
Project description:Sarcopenia, a multifactorial systemic disorder, has attracted extensive attention, yet its pathogenesis is not fully understood, partly due to limited research on the relationship between lipid metabolism abnormalities and sarcopenia. Lipidomics offers the possibility to explore this relationship. Our research utilized LC/MS-based nontargeted lipidomics to investigate the lipid profile changes associated with sarcopenia, aiming to enhance understanding of its underlying mechanisms. The study included 40 sarcopenia patients and 40 control subjects matched 1:1 by sex and age. Plasma lipids were detected and quantified, with differential lipids identified through univariate and multivariate statistical analyses. Weighted correlation network analysis (WGCNA) and MetaboAnalyst were used to identify lipid modules related to the clinical traits of sarcopenia patients and to conduct pathway analysis, respectively. A total of 34 lipid subclasses and 1446 lipid molecules were detected. Orthogonal partial least squares discriminant analysis (OPLS-DA) identified 80 differential lipid molecules, including 38 phospholipids. Network analysis revealed that the brown module (encompassing phosphatidylglycerol (PG) lipids) and the yellow module (containing phosphatidylcholine (PC), phosphatidylserine (PS), and sphingomyelin (SM) lipids) were closely associated with clinical traits such as maximum grip strength and skeletal muscle mass index (SMI). Pathway analysis highlighted the potential role of the glycerophospholipid metabolic pathway in lipid metabolism within the context of sarcopenia. These findings suggest a correlation between sarcopenia and lipid metabolism disturbances, providing valuable insights into the disease's underlying mechanisms and indicating potential avenues for further investigation.
Project description:Metformin, an anti-diabetes drug, has recently emerged as a potential "anti-aging" intervention based on its reported beneficial actions against aging in preclinical studies. Nonetheless, very few metformin studies using mice have determined metformin concentrations, and many effects of metformin have been observed in preclinical studies using doses or concentrations that are not relevant to therapeutic levels in humans. We developed a liquid chromatography-tandem mass spectrometry protocol for measuring metformin in plasma, liver, brain, kidney, and muscle of mice. Young adult male and female C57BL/6 mice voluntarily received metformin at 4 mg/ml in drinking water, which translates to the maximum human dose of 2.5 g/day. Clinically relevant steady-state plasma metformin concentrations were achieved at 7 and 30 days after the start of treatment in male and female mice. Metformin concentrations were slightly higher in muscle than in plasma, and ~3- and 6-fold higher in the liver and kidney than in plasma, respectively. A low metformin concentration was found in the brain, at ~20% of the plasma level. Furthermore, a gender difference in steady-state metformin bio-distribution was observed. Our study established steady-state metformin levels in plasma, liver, muscle, kidney, and brain of normoglycemic mice treated with a clinically relevant dose, providing insight for future metformin preclinical studies aimed at clinical translation.