Project description:A fourth industrial revolution, occurring in global manufacturing, provides a vision of future manufacturing systems that incorporate highly dynamic physical systems, robust and responsive communications systems, and computing paradigms to maximize efficiency, enable mobility, and realize the promises of the digital factory. Wireless technology is a key enabler of that vision. A comprehensive graphical model of a generic wireless factory work-cell is developed using the Systems Modeling Language (SysML), a standardized and semantically rich modeling language, to link the physical and network domains in such a cyber-physical system (CPS). The proposed model identifies the structural primitives, interfaces, and behaviors of the highly connected factory work-cell, in which wireless technology carries significant data flows involved in control algorithms. The model includes parametric definitions that encapsulate the information loss, delay, and mutation associated with the wireless network, and it identifies pertinent wireless information flows.
Project description:The article presents BB-SPICE (SPICE for Biochemical and Biological Systems), an extension of the well-known Simulation Program with Integrated Circuit Emphasis (SPICE). The BB-SPICE environment is composed of three modules: a new compact textual description formalism for biological systems; a converter that processes this description and generates the SPICE netlist of the equivalent electronic circuit; and NGSPICE, an open-source SPICE simulator. In addition, the environment provides interfaces to and from SBML (Systems Biology Markup Language), a very common description language used in systems biology. BB-SPICE has been developed to bridge the gap between the simulation of biological systems on the one hand and of electronic circuits on the other. It is therefore suitable for applications at the interface between both domains, such as the development of design tools for synthetic biology and the virtual prototyping of biosensors and lab-on-chip devices. Simulation results obtained with BB-SPICE and COPASI (open-source software used for the simulation of biochemical systems) have been compared on a benchmark of models commonly used in systems biology. The results agree quantitatively, but BB-SPICE is faster than COPASI by 1 to 3 orders of magnitude in computation time. Moreover, because our software is based on NGSPICE, it could benefit from upcoming developments such as a GPU implementation, coupling with powerful analysis and verification tools, or integration into design automation tools for synthetic biology.
Project description:Objectives: Health care organizations are increasingly employing social workers to address patients' social needs. However, social work (SW) activities in health care settings are largely captured as text data within electronic health records (EHRs), making measurement and analysis difficult. This study aims to extract and classify, from EHR notes, interventions intended to address patients' social needs using natural language processing (NLP) and machine learning (ML) algorithms. Study design: Secondary data analysis of a longitudinal cohort. Methods: We extracted 815 SW encounter notes from the EHR system of a federally qualified health center. We reviewed the literature to derive a 10-category classification scheme for SW interventions. We applied NLP and ML algorithms to categorize the documented SW interventions in EHR notes according to the 10-category classification scheme. Results: Most of the SW notes (n = 598; 73.4%) contained at least 1 SW intervention. The most frequent interventions offered by social workers included care coordination (21.5%), education (21.0%), financial planning (18.5%), referral to community services and organizations (17.1%), and supportive counseling (15.3%). High-performing classification algorithms included the kernelized support vector machine (SVM) (accuracy, 0.97), logistic regression (accuracy, 0.96), linear SVM (accuracy, 0.95), and multinomial naive Bayes classifier (accuracy, 0.92). Conclusions: NLP and ML can be utilized for automated identification and classification of SW interventions documented in EHRs. Health care administrators can leverage this automated approach to gain better insight into the most needed social interventions in the patient population served by their organizations. Such information can be applied in managerial decisions related to SW staffing, resource allocation, and patients' social needs.
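To make the classification step concrete, the following is a minimal sketch of one of the algorithm families named above, a multinomial naive Bayes classifier over bag-of-words features, written in plain Python. It is not the study's pipeline: the tokenizer, the class name `MultinomialNaiveBayes`, and the six training notes are invented for illustration, and only three of the ten intervention categories are used.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNaiveBayes:
    """Bag-of-words multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, notes, labels):
        self.class_counts = Counter(labels)       # notes per category
        self.word_counts = defaultdict(Counter)   # token counts per category
        self.vocabulary = set()
        for note, label in zip(notes, labels):
            tokens = tokenize(note)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, note):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(note) if t in self.vocabulary]
        total_notes = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_notes in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_notes / total_notes)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy notes (NOT data from the study), labeled with three of the
# intervention categories named in the abstract.
TRAIN_NOTES = [
    ("Coordinated follow-up appointment with primary care team", "care coordination"),
    ("Scheduled transportation and coordinated specialist visit", "care coordination"),
    ("Provided education on diabetes diet and medication use", "education"),
    ("Educated patient about insurance enrollment steps", "education"),
    ("Referred patient to community food bank", "referral"),
    ("Referral made to local housing organization", "referral"),
]

model = MultinomialNaiveBayes().fit(
    [note for note, _ in TRAIN_NOTES], [label for _, label in TRAIN_NOTES]
)
print(model.predict("Referred to food bank in the community"))
```

A real system would need far more labeled notes, a held-out evaluation set, and probably a library implementation; the sketch only shows the shape of the note-to-category mapping.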
Project description:Saccharomyces cerevisiae is the most well-characterized eukaryote, the preferred microbial cell factory for the largest industrial biotechnology product (bioethanol), and a robust, commercially compatible scaffold to be exploited for diverse chemical production. Succinic acid is a highly sought-after added-value chemical for which there is no native predisposition for production and accumulation in S. cerevisiae. The genome-scale metabolic network reconstruction of S. cerevisiae enabled in silico gene deletion predictions using an evolutionary programming method to couple biomass and succinate production. Glycine and serine, both essential amino acids required for biomass formation, are formed from both glycolytic and TCA cycle intermediates. Succinate formation results from the isocitrate lyase-catalyzed conversion of isocitrate, and from the α-keto-glutarate dehydrogenase-catalyzed conversion of α-keto-glutarate. Succinate is subsequently depleted by the succinate dehydrogenase complex. The metabolic engineering strategy identified included deletion of the primary succinate-consuming reaction, Sdh3p, and interruption of glycolysis-derived serine by deletion of 3-phosphoglycerate dehydrogenase, Ser3p/Ser33p. Pursuing these targets, a multi-gene deletion strain was constructed, and directed evolution with selection was used to identify a succinate-producing mutant. Physiological characterization, coupled with integrated analysis of transcriptome data from the metabolically engineered strain, was used to identify second-round metabolic engineering targets. The resulting strain represents a 30-fold improvement in succinate titer and a 43-fold improvement in succinate yield on biomass, with only a 2.8-fold decrease in the specific growth rate compared to the reference strain. Intuitive genetic targets for either over-expression or interruption of succinate-producing or succinate-consuming pathways, respectively, do not lead to increased succinate. Rather, we demonstrate how systems biology tools coupled with directed evolution and selection allow non-intuitive, rapid, and substantial redirection of carbon fluxes in S. cerevisiae, and hence show proof of concept that this is a potentially attractive cell factory for over-producing different platform chemicals.
Project description:Predicting which proteins interact from their amino acid sequences is an important task. We develop a method to pair interacting protein sequences that leverages the power of protein language models trained on multiple sequence alignments (MSAs), such as MSA Transformer and the EvoFormer module of AlphaFold. We formulate the problem of pairing interacting partners among the paralogs of two protein families in a differentiable way. We introduce a method called Differentiable Pairing using Alignment-based Language Models (DiffPALM) that solves it by exploiting the ability of MSA Transformer to fill in masked amino acids in multiple sequence alignments using the surrounding context. MSA Transformer encodes coevolution between functionally or structurally coupled amino acids within protein chains. It also captures inter-chain coevolution, despite being trained on single-chain data. Relying on MSA Transformer without fine-tuning, DiffPALM outperforms existing coevolution-based pairing methods on difficult benchmarks of shallow multiple sequence alignments extracted from ubiquitous prokaryotic protein datasets. It also outperforms an alternative method based on a state-of-the-art protein language model trained on single sequences. Paired alignments of interacting protein sequences are a crucial ingredient of supervised deep learning methods to predict the three-dimensional structure of protein complexes. Starting from sequences paired by DiffPALM substantially improves the structure prediction of some eukaryotic protein complexes by AlphaFold-Multimer. It also achieves performance competitive with orthology-based pairing.
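The abstract does not say how the pairing problem is made differentiable. One standard relaxation, shown here purely as an assumption and not as the authors' stated method, replaces a hard one-to-one pairing (a permutation matrix) with a doubly stochastic "soft permutation" obtained by Sinkhorn normalization of a score matrix. The function names, the temperature parameter, and the 3x3 score matrix below are all hypothetical.

```python
import math


def sinkhorn(scores, temperature=1.0, n_iters=50):
    """Map a square score matrix to an approximately doubly stochastic matrix.

    Rows and columns are normalized alternately; every operation is smooth,
    so gradients with respect to the scores are well defined.
    """
    n = len(scores)
    # Subtract the global maximum before exponentiating for numerical stability.
    top = max(max(row) for row in scores)
    matrix = [[math.exp((s - top) / temperature) for s in row] for row in scores]
    for _ in range(n_iters):
        # Make every row sum to 1.
        matrix = [[v / sum(row) for v in row] for row in matrix]
        # Make every column sum to 1.
        col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
        matrix = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return matrix


def row_argmax(soft):
    """Read off, for each row, the column with the largest weight.

    This is only a convenience for inspection: it is not guaranteed to give a
    valid one-to-one pairing; an assignment solver would be needed for that.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in soft]


# Hypothetical scores: entry [i][j] is how well paralog i of family A
# matches paralog j of family B (higher is better).
SCORES = [
    [2.0, 0.1, 0.3],
    [0.2, 0.1, 1.5],
    [0.1, 1.8, 0.2],
]

soft = sinkhorn(SCORES)
print(row_argmax(soft))
```

In a gradient-based setting the score matrix would be the trainable parameter and a model-derived loss would be backpropagated through the normalization; the sketch covers only the relaxation itself.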
Project description:Understanding the function and fitness effects of diverse plant genomes requires transferable models. Language models (LMs) pre-trained on large-scale biological sequences can learn evolutionary conservation and are thus expected to offer better cross-species prediction, through fine-tuning on limited labeled data, than supervised deep learning models. We introduce PlantCaduceus, a plant DNA LM based on the Caduceus and Mamba architectures, pre-trained on a carefully curated dataset consisting of 16 diverse Angiosperm genomes. Fine-tuning PlantCaduceus on limited labeled Arabidopsis data for four tasks involving transcription and translation modeling demonstrated high transferability to maize, which diverged 160 million years ago, outperforming the best baseline model by 1.45-fold to 7.23-fold. PlantCaduceus also enables genome-wide deleterious mutation identification without multiple sequence alignment (MSA). PlantCaduceus demonstrated a threefold enrichment of rare alleles in prioritized deleterious mutations compared to MSA-based methods and matched state-of-the-art protein LMs. PlantCaduceus is a versatile pre-trained DNA LM expected to accelerate plant genomics and crop breeding applications.
Project description:Motivation: Identifying B-cell epitopes is an essential step for guiding rational vaccine development and immunotherapies. Since experimental approaches are expensive and time-consuming, many computational methods have been designed to assist B-cell epitope prediction. However, existing sequence-based methods have limited performance since they only use contextual features of the sequential neighbors while neglecting structural information. Results: Based on the recent breakthrough of AlphaFold2 in protein structure prediction, we propose GraphBepi, a novel graph-based model for accurate B-cell epitope prediction. For a given protein, the structure predicted by AlphaFold2 is used to construct the protein graph, where the nodes (residues) are encoded by representations learned by ESM-2. The graph is input into an edge-enhanced deep graph neural network (EGNN) to capture the spatial information in the predicted 3D structure. In parallel, a bidirectional long short-term memory network (BiLSTM) is employed to capture long-range dependencies in the sequence. The low-dimensional representations learned by the EGNN and the BiLSTM are then combined and passed to a multilayer perceptron for predicting B-cell epitopes. Through comprehensive tests on the curated epitope dataset, GraphBepi was shown to outperform the state-of-the-art methods by more than 5.5% and 44.0% in terms of AUC and AUPR, respectively. A web server is freely available at http://bio-web1.nscc-gz.cn/app/graphbepi. Availability and implementation: The datasets, pre-computed features, source code, and the trained model are available at https://github.com/biomed-AI/GraphBepi.
Project description:CRISPR/Cas systems are popular genome editing tools that belong to a class of programmable nucleases and have enabled tremendous progress in the field of regenerative medicine. We here outline the structural and molecular frameworks of the well-characterized type II CRISPR system and several computational tools intended to facilitate experimental designs. The use of CRISPR tools to generate disease models has advanced research into the molecular aspects of disease conditions, including unraveling the molecular basis of immune rejection. Advances in regenerative medicine have been hindered by major histocompatibility complex-human leukocyte antigen (HLA) genes, which pose a major barrier to cell- or tissue-based transplantation. Based on progress in CRISPR, including in recent clinical trials, we hypothesize that the generation of universal donor immune-engineered stem cells is now a realistic approach to tackling a multitude of disease conditions.
Project description:Typically, component-oriented acausal hybrid modeling of complex dynamic systems is implemented with specialized modeling languages. A well-known example is the Modelica language. The specialized nature of such languages, and the complexity of implementing and learning them, somewhat limit their development and wide use by developers who know only general-purpose languages. The paper proposes a principle for developing a Modelica-like system that is simple to understand and modify, based on the general-purpose programming language Python. The principle is as follows: (1) Python classes are used to describe components and their systems, (2) the declarative symbolic tools of SymPy are used to describe component behavior by difference or differential equations, (3) the solution procedure uses a function initially created with the SymPy lambdify function and computes unknown values in the current step using known values from the previous step, (4) Python imperative constructs are used for simple event handling, (5) external solvers of differential-algebraic equations can optionally be applied via the Assimulo interface, (6) the SymPy package makes it possible to manipulate model equations arbitrarily, generate code, and solve some equations symbolically. A basic set of mechanical components (1D translational "mass", "spring-damper" and "force") is developed. Models of a sucker-rod string are developed and simulated using these components. Comparison of the sucker-rod string simulation results with practical dynamometer cards and Modelica results verifies the adequacy of the models. The proposed approach simplifies understanding of the system, its modification and improvement, and its adaptation for other purposes; it makes the system available to a much larger community and simplifies integration into third-party software.
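Points (1), (3), and (4) of the principle above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper's code: the class names, parameter values, and the `simulate` helper are hypothetical, and the update rule is written by hand as a semi-implicit Euler difference equation instead of being generated from symbolic SymPy equations with lambdify.

```python
class Mass:
    """1D translational mass; its state is position x and velocity v."""

    def __init__(self, m, x=0.0, v=0.0):
        self.m, self.x, self.v = m, x, v


class SpringDamper:
    """Spring-damper tying the mass to a fixed wall at x = 0."""

    def __init__(self, k, c):
        self.k, self.c = k, c

    def force_on(self, mass):
        return -self.k * mass.x - self.c * mass.v


class Force:
    """External force source whose value can be changed by events."""

    def __init__(self, value):
        self.value = value

    def force_on(self, mass):
        return self.value


class System:
    """A mass plus the elements acting on it, advanced with a fixed step dt."""

    def __init__(self, mass, elements, dt):
        self.mass, self.elements, self.dt = mass, elements, dt
        self.t = 0.0

    def step(self):
        # The net force is computed from the values known at the previous step.
        net = sum(e.force_on(self.mass) for e in self.elements)
        # Semi-implicit Euler: update velocity first, then position with it.
        self.mass.v += self.dt * net / self.mass.m
        self.mass.x += self.dt * self.mass.v
        self.t += self.dt


def simulate(t_end=20.0, dt=0.001, release_time=None):
    """Run the mass/spring-damper/force system and return the final position."""
    mass = Mass(m=1.0)
    load = Force(2.0)
    system = System(mass, [SpringDamper(k=4.0, c=2.0), load], dt)
    while system.t < t_end:
        # Simple event handling with an ordinary imperative `if`.
        if release_time is not None and system.t >= release_time:
            load.value = 0.0
        system.step()
    return mass.x


print(simulate())                  # constant load: settles near F/k
print(simulate(release_time=5.0))  # load removed at t = 5: returns toward 0
```

With a constant load the position settles at the static deflection F/k (here 2.0/4.0), which gives a quick sanity check on the component equations.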