Project description:Motivation: Drift tube ion mobility spectrometry coupled with mass spectrometry (DTIMS-MS) is increasingly implemented in high throughput omics workflows, and new informatics approaches are necessary for processing the associated data. To automatically extract arrival times for molecules measured by DTIMS at multiple electric fields and compute their associated collisional cross sections (CCS), we created the PNNL Ion Mobility Cross Section Extractor (PIXiE). The primary application presented for this algorithm is the extraction of data that can then be used to create a reference library of experimental CCS values for use in high throughput omics analyses. Results: We demonstrate the utility of this approach by automatically extracting arrival times and calculating the associated CCSs for a set of endogenous metabolites and xenobiotics. The PIXiE-generated CCS values were within error of those calculated using commercially available instrument vendor software. Availability and implementation: PIXiE is an open-source tool, freely available on GitHub. The documentation, source code of the software, and a GUI can be found at https://github.com/PNNL-Comp-Mass-Spec/PIXiE and the source code of the backend workflow library used by PIXiE can be found at https://github.com/PNNL-Comp-Mass-Spec/IMS-Informed-Library . Contact: erin.baker@pnnl.gov or thomas.metz@pnnl.gov. Supplementary information: Supplementary data are available at Bioinformatics online.
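The multi-field calculation described in the PIXiE abstract above can be illustrated with the textbook stepped-field approach: arrival time is regressed against inverse drift voltage, the slope gives the mobility, and the Mason-Schamp equation converts reduced mobility to CCS. This is a minimal sketch under stated assumptions, not PIXiE's actual implementation. The function names, the uniform-field drift tube model, and the example instrument values (tube length, pressure, temperature) are all hypothetical.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
DALTON = 1.66053906660e-27  # atomic mass unit, kg
N0 = 2.6867811e25           # gas number density at 273.15 K and 760 Torr, m^-3
T0_K = 273.15               # standard temperature, K
P0_TORR = 760.0             # standard pressure, Torr


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reduced_mobility(voltages_v, arrival_times_s, length_m, pressure_torr, temp_k):
    """Return (K0 in m^2/(V s), t0 in s) from arrival times at several drift voltages.

    Assumed model: t_arrival = t0 + L^2 / (K * V), so the slope of
    t_arrival versus 1/V equals L^2 / K.
    """
    slope, t0 = fit_line([1.0 / v for v in voltages_v], arrival_times_s)
    k = length_m ** 2 / slope  # mobility at experimental conditions
    k0 = k * (pressure_torr / P0_TORR) * (T0_K / temp_k)  # normalize to standard conditions
    return k0, t0


def ccs_angstrom2(k0, ion_mass_da, gas_mass_da, charge, temp_k):
    """Mason-Schamp equation; returns the CCS in square angstroms."""
    mu = ion_mass_da * gas_mass_da / (ion_mass_da + gas_mass_da) * DALTON  # reduced mass, kg
    omega_m2 = (
        (3.0 / 16.0)
        * math.sqrt(2.0 * math.pi / (mu * K_B * temp_k))
        * charge * E_CHARGE / (N0 * k0)
    )
    return omega_m2 * 1e20  # m^2 -> angstrom^2


if __name__ == "__main__":
    # Synthetic example values (hypothetical, not taken from the abstract).
    length_m, pressure_torr, temp_k = 0.78, 4.0, 300.0
    volts = [800.0, 1000.0, 1200.0, 1400.0]
    times = [0.0264, 0.0215, 0.0183, 0.0160]
    k0, t0 = reduced_mobility(volts, times, length_m, pressure_torr, temp_k)
    print(k0, t0, ccs_angstrom2(k0, 500.0, 28.0061, 1, temp_k))
```

Regressing against inverse voltage separates the field-independent time offset `t0` (the time spent outside the drift region) from the drift time itself, which is why measurements at several fields are needed.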
Project description:Time of flight secondary ion mass spectrometry (ToF-SIMS) is a powerful surface-sensitive characterization tool allowing the imaging of chemical properties over a wide range of organic and inorganic material systems. This technique allows precise studies of chemical composition with sub-100-nm lateral and nanometer depth spatial resolution. However, comprehensive interpretation of ToF-SIMS results is challenging because of the very large data volume and high dimensionality. Furthermore, investigation of samples with pronounced topographical features is complicated by systematic and measurable shifts in the mass spectrum. In this work we developed an approach for the interpretation of ToF-SIMS data based on advanced data analytics. Along with characterization of the chemical composition, our approach allows extraction of the sample surface morphology from a time of flight registration technique. This approach allows one to perform correlated investigations of surface morphology, biological function, and chemical composition of Arabidopsis roots.
Project description:Neutron spin-echo spectrometers with a position-sensitive detector and operating with extended time-of-flight-tagged wavelength frames are able to collect a comprehensive set of data covering a large range of wavevector and Fourier time space with only a few instrumental settings in a quasi-continuous way. Extracting all the information contained in the raw data and mapping them to a suitable physical space in the most efficient way is a challenge. This article reports algorithms employed in dedicated software, DrSpine (data reduction for spin echo), that achieves this goal and yields reliable representations of the intermediate scattering function S(Q, t) independent of the selected 'binning'.
Project description:In order to preserve the in vivo metabolite levels of cells, a quenching protocol must be executed quickly to avoid chemical or biological degradation of labile metabolites. In the case of mammalian cell cultures cultivated in complex media, a wash step prior to quenching is necessary to avoid contamination of the cell pellet with extracellular metabolites, which could distort the real intracellular concentration of metabolites. This is typically achieved either by one or more centrifugation/wash steps, which delay quenching (even harsh centrifugation requires several minutes of processing before the cells are quenched), or by filtration. In this article, we describe and evaluate a two-step optimized protocol based on fast filtration by use of a vacuum pump for quenching and subsequent extraction of intracellular metabolites from CHO (Chinese hamster ovary) suspension cells, which uses commercially available components. The method allows transfer of washed cells into liquid nitrogen within 10-15 s of sampling and recovers the entire extraction solution volume. It also has the advantage of removing residual filter filaments from the final sample, thus preventing damage to separation columns during subsequent MS analysis. Relative to other methods currently used in the literature, the resulting energy charge of intracellular adenosine nucleotides was increased to 0.94, compared to 0.90 with cold PBS quenching or 0.82 with cold methanol/AMBIC quenching.
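The energy charge reported in the quenching abstract above is conventionally the adenylate energy charge, (ATP + 0.5 ADP) / (ATP + ADP + AMP). The sketch below assumes that standard definition, since the abstract does not state the formula. The function name and the example concentrations are hypothetical.

```python
def adenylate_energy_charge(atp, adp, amp):
    """Adenylate energy charge from ATP, ADP, and AMP amounts in the same units.

    Ranges from 0 (all AMP) to 1 (all ATP).
    """
    total = atp + adp + amp
    if total <= 0:
        raise ValueError("total adenylate pool must be positive")
    return (atp + 0.5 * adp) / total


if __name__ == "__main__":
    # Hypothetical relative concentrations, not data from the abstract.
    print(adenylate_energy_charge(atp=8.0, adp=1.5, amp=0.5))
```

Because the value is a ratio, it does not depend on the absolute recovery of the extraction, only on the relative amounts of the three nucleotides. Slow quenching lets ATP hydrolyze, which lowers the ratio.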
Project description:Urban settlements are rapidly growing outward and upward, with consequences for resource use, greenhouse gas emissions, and ecosystem and public health, but rates of change are uneven around the world. Understanding trajectories and predicting consequences of global urban expansion requires quantifying rates of change with consistent, well-calibrated data. Microwave backscatter data provide important information on upward urban growth, essentially the vertical built-up area. We developed a multi-sensor, multi-decadal, gridded (0.05° lat/lon) data set of global urban microwave backscatter, 1993-2020. Backscatter from two C-band sensors (ERS and ASCAT) and one Ku-band sensor (QuikSCAT) is compared at four invariant non-urban sites (~3500 km²) to evaluate instrument stability and multi-decadal patterns. For urban areas, there was a strong linear correlation (overall R² = 0.69) between 2015 ASCAT urban backscatter and a continental-scale gridded product of building volume, across 8450 urban grid cells (0.05° × 0.05°) in Europe, China, and the USA. This urban backscatter data set provides a time series characterizing global urban change over the past three decades.
Project description:Early detection of bacteremia is important to prevent antibiotic abuse. Therefore, we aimed to develop a clinically applicable bacteremia prediction model using machine learning technology. Data were extracted from the electronic medical records of two tertiary medical centers over a 12-year period. Multi-layer perceptron (MLP), random forest, and gradient boosting algorithms were applied for machine learning analysis. Clinical data within 12 and 24 hours of blood culture were analyzed and compared. Out of 622,771 blood cultures, 38,752 episodes of bacteremia were identified. In MLP with 128 hidden layer nodes, the area under the receiver operating characteristic curve (AUROC) of the prediction performance in 12- and 24-h data models was 0.762 (95% confidence interval (CI): 0.7617-0.7623) and 0.753 (95% CI: 0.7520-0.7529), respectively. In the causative-pathogen subgroup analysis, the AUROC was highest for Acinetobacter baumannii bacteremia, at 0.839 (95% CI: 0.8388-0.8394). Compared with primary bacteremia, the AUROC for sepsis caused by pneumonia was the highest. Predictive performance for bacteremia was superior in younger age groups. Bacteremia prediction using machine learning technology appeared possible for acute infectious diseases. The model was especially suitable for pneumonia caused by Acinetobacter baumannii. From the 24-h blood culture data, bacteremia was predictable by substituting only the continuously variable values.
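The AUROC values quoted in the bacteremia abstract above have a direct probabilistic reading: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The sketch below computes that quantity by pair counting. It illustrates the metric only and is not the study's evaluation code. The function name and the toy labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: iterable of 0/1 outcomes (1 = positive, e.g. bacteremia).
    scores: model scores, where higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not from the study.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration. For a data set of several hundred thousand cultures, a rank-based implementation would be the practical choice.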
Project description:The use of mass-spectrometry-based techniques for global protein profiling of biomedical or environmental experiments has become a major focus in research centered on biomarker discovery; however, one of the most important issues recently highlighted in the new era of omics data generation is the ability to perform analyses in a robust and reproducible manner. This has been hypothesized to be one of the issues hindering the ability of clinical proteomics to successfully identify clinical diagnostic and prognostic biomarkers of disease. P-Mart ( https://pmart.labworks.org ) is a new interactive web-based software environment that enables domain scientists to perform quality-control processing, statistics, and exploration of large, complex proteomics data sets without requiring statistical programming. P-Mart is designed so that researchers can perform analyses via a series of modules, explore the results using interactive visualization, and finalize the analyses with a collection of output files documenting all stages of the analysis and a report to allow reproduction of the analysis.
Project description:To non-destructively resolve and diagnose the degradation mechanisms of lithium-ion batteries (LIBs), it is necessary to cross-scale decouple complex kinetic processes through the distribution of relaxation times (DRT). However, LIBs with low interfacial impedance render DRT unreliable without data processing and closed-loop validation. This study proposes a hierarchical analytical framework to enhance timescale resolution and reduce uncertainty, including interfacial impedance reconstruction and multi-dimensional DRT analysis. Interfacial impedance is reconstructed by eliminating simulated inductive and diffusive impedance based on a high-fidelity frequency-domain model. Multi-dimensional DRT decouples solid electrolyte interphase (SEI) and charge transfer (CT) processes by the reversibility of electrochemical reactions with state of charge (SOC) to characterize electrode kinetic evolution driven by SOC and temperature through timescales and peak area. The findings reveal that reconstructed impedance improves the accuracy of identified time constants by ≈20%. Cross-scale DRT results reveal that SOCs below 10% at 25 °C effectively distinguish electrode kinetics due to the high correlation between cathodic CT and SOC. Kinetic metrics characterize that anodic SEI or CT are different control steps limiting the low-temperature performance of different cells. This work underscores the potential of the proposed framework for non-destructive diagnostics of kinetic evolution.
Project description:The rotationally averaged collision cross-section (CCS) determined by ion mobility-mass spectrometry (IM-MS) facilitates the identification of various biomolecules. Although machine learning (ML) models have recently emerged as a highly accurate approach for predicting CCS values, they rely on large data sets from various instruments, calibrants, and setups, which can introduce additional errors. In this study, we identified and validated an ion's polarizability and mass-to-charge ratio (m/z) as having the most significant predictive power for traveling-wave IM CCS values relative to other physicochemical properties of ions. Constructed solely from these two physicochemical properties, our CCS prediction approach demonstrated high accuracy (mean relative error of <3.0%) even when trained with limited data (15 CCS values). Given its ability to perform well with limited data, our approach has considerable potential for constructing a precisely predicted CCS database tailored to each distinct experimental setup. A Python script for CCS prediction using our approach is freely available at https://github.com/MSBSiriraj/SVR_CCSPrediction under the GNU General Public License (GPL) version 3.
Project description:Enhanced accuracy and high-throughput capability in capturing genetic activities have led to the widespread application of ChIP-sequencing technology in diverse studies of DNA-protein interactions. However, questions such as how to choose suitable ChIP-seq arguments and how to compare sample quality still trouble biologists. We propose methods for determining optimal argument pairs in the global alignment of ChIP-sequencing data. We then employ a modern signal processing approach to extract inherent genomic features from the global alignments of transcriptional binding activities and, together with pairwise comparisons from intra- and inter-sample perspectives, determine alignment quality and select the optimal candidate among multi-source heterogeneous high-throughput sequences. The work provides a practical approach to quantitatively compare alignment quality for heterogeneous sequencing data, especially in determining the efficiency of transcriptional binding from replicate samples, and thus helps exploit the potential of ChIP-seq for a deeper understanding of the biological meaning in high-throughput genomic sequences.