Project description:Aptamers are single-stranded nucleic acid ligands that bind to target molecules with high affinity and specificity. They are typically discovered by searching large libraries for sequences with desirable binding properties. These libraries, however, are practically constrained to a fraction of the theoretical sequence space. Machine learning provides an opportunity to intelligently navigate this space to identify high-performing aptamers. Here, we propose an approach that employs particle display (PD) to partition a library of aptamers by affinity, and uses such data to train machine learning models to predict affinity in silico. Our model predicted high-affinity DNA aptamers from experimental candidates at a rate 11-fold higher than random perturbation and generated novel, high-affinity aptamers at a greater rate than observed by PD alone. Our approach also facilitated the design of truncated aptamers 70% shorter and with higher binding affinity (1.5 nM) than the best experimental candidate. This work demonstrates how combining machine learning and physical approaches can be used to expedite the discovery of better diagnostic and therapeutic agents.
Project description:Biocatalytic reactions often require supplying chemical energy and phosphate groups in the form of adenosine triphosphate (ATP). Auxiliary enzymes can be used to convert a reaction by-product, adenosine diphosphate (ADP), back to ATP. By employing real-time mass spectrometry (RTMS), one can gain insight into inter-conversions of reactants in multi-enzyme reaction systems and optimize the reaction conditions. In this study, temporal traces of ions corresponding to adenosine monophosphate (AMP), ADP and ATP provided vital information that could be used to adjust activities of the 'buffering enzymes'. Using the RTMS results as feedback, we also characterized a bienzymatic energy buffer that enables the recovery of ATP in cases where it is directly hydrolysed to AMP in the main enzymatic reaction. The significance of careful selection of enzyme activities, guided by RTMS, is exemplified in the synthesis of glucose-6-phosphate by hexokinase in the presence of a buffering enzyme, pyruvate kinase. The relative activities of the two enzymes present in the reaction mixture influence biosynthetic reaction yields. This observation supports the conclusion that optimization of chemical energy recycling procedures is critical for the biosynthetic reaction economy.
Project description:On the basis of multidimensional and comprehensive molecular characterization (including DNA methylation and copy number, RNA, and protein expression), we classified 894 renal cell carcinomas (RCCs) of various histologic types into nine major genomic subtypes. Site of origin within the nephron was one major determinant in the classification, reflecting differences among clear cell, chromophobe, and papillary RCC. Widespread molecular changes associated with TFE3 gene fusion or chromatin modifier genes were present within a specific subtype and spanned multiple subtypes. Differences in patient survival and in alteration of specific pathways (including hypoxia, metabolism, MAP kinase, NRF2-ARE, Hippo, immune checkpoint, and PI3K/AKT/mTOR) could further distinguish the subtypes. Immune checkpoint markers and molecular signatures of T cell infiltrates were both highest in the subtype associated with aggressive clear cell RCC. Differences between the genomic subtypes suggest that therapeutic strategies could be tailored to each RCC disease subset.
Project description:In this contribution, a semi-automatic segmentation algorithm for (medical) image analysis is presented. More precisely, the approach belongs to the category of interactive contouring algorithms, which provide real-time feedback of the segmentation result. However, even with interactive real-time contouring approaches there are always cases where the user cannot find a satisfying segmentation, e.g. due to homogeneous appearances between the object and the background, or noise inside the object. For these difficult cases the algorithm still needs additional user support. However, this additional user support should be intuitive and rapidly integrated into the segmentation process, without breaking the interactive real-time segmentation feedback. I propose a solution where the user can support the algorithm by an easy and fast placement of one or more seed points to guide the algorithm to a satisfying segmentation result, even in difficult cases. These additional seeds restrict the calculation of the segmentation for the algorithm but, at the same time, still allow the interactive real-time segmentation feedback to continue. For a practical and genuine application in translational science, the approach has been tested on medical data from the clinical routine in 2D and 3D.
Project description:For many macromolecular assemblies, both a cryo-electron microscopy map and atomic structures of their component proteins are available. Here we describe a method for fitting and refining a component structure within its map at intermediate resolution (<15 Å). The atomic positions are optimized with respect to a scoring function that includes the cross-correlation coefficient between the structure and the map as well as stereochemical and nonbonded interaction terms. A heuristic optimization that relies on a Monte Carlo search, a conjugate-gradients minimization, and simulated annealing molecular dynamics is applied to a series of subdivisions of the structure into progressively smaller rigid bodies. The method was tested on 15 proteins of known structure with 13 simulated maps and 3 experimentally determined maps. At approximately 10 Å resolution, the Cα RMSD between the initial and final structures was reduced on average by approximately 53%. The method is automated and can refine both experimental and predicted atomic structures.
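The cross-correlation term of such a scoring function can be illustrated with a minimal sketch. It assumes the mean-centred (Pearson) form of the coefficient and represents both the model-derived density and the experimental map as flattened voxel lists; the abstract does not specify the normalization used, and the function name `cross_correlation` is hypothetical.

```python
import math
from typing import Sequence


def cross_correlation(model: Sequence[float], target: Sequence[float]) -> float:
    """Mean-centred cross-correlation coefficient between two density grids.

    Both grids are given as flattened sequences of voxel values sampled on
    the same lattice. This normalization is an assumption made for
    illustration, not necessarily the one used by the method described above.
    """
    if len(model) != len(target) or len(model) == 0:
        raise ValueError("grids must be non-empty and of equal length")

    n = len(model)
    mean_m = sum(model) / n
    mean_t = sum(target) / n

    # Covariance-like numerator and the two variance-like denominators.
    cov = sum((m - mean_m) * (t - mean_t) for m, t in zip(model, target))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_t = sum((t - mean_t) ** 2 for t in target)

    if var_m == 0 or var_t == 0:
        raise ValueError("a constant grid has no defined correlation")
    return cov / math.sqrt(var_m * var_t)
```

A coefficient close to 1 indicates that the simulated density of the model follows the map closely. In a refinement, a term of this kind would be combined with the stereochemical and nonbonded terms mentioned in the abstract.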
Project description:Highly flexible proteins present a special challenge for structure determination because they are multi-structured yet not disordered, so their conformational ensembles are essential for understanding function. Because spectroscopic measurements of multiple conformational populations often provide sparse data, experiment selection is a limiting factor in conformational refinement. An approach based on molecular simulations and information theory has been developed to select which experiments best refine conformational ensembles. This approach was tested on three flexible proteins. For proteins where a clear mechanistic hypothesis exists, experiments that test this hypothesis were systematically identified. When available data did not yield such mechanistic hypotheses, experiments that significantly outperform structure-guided approaches in conformational refinement were identified. This approach offers a particular advantage when refining challenging, underdetermined protein conformational ensembles.
Project description:Seven new species of the Neotropical hairstreak genus Oenomaus are described: Oenomaus mancha Busby & Faynel, sp. n. (type locality Ecuador); Oenomaus gwenish Robbins & Faynel, sp. n. (type locality Panama); Oenomaus lea Faynel & Robbins, sp. n. (type locality Ecuador); Oenomaus myrteana Busby, Robbins & Faynel, sp. n. (type locality Ecuador); Oenomaus mentirosa Faynel & Robbins, sp. n. (type locality Peru); Oenomaus andi Busby & Faynel, sp. n. (type locality Ecuador) and Oenomaus moseri Robbins & Faynel, sp. n. (type locality Brazil, Santa Catarina). For each new Oenomaus species, we present diagnostic characters and notes on its habitat and biology. We illustrate adults, genitalia, and distribution. New distributional and biological data are presented for 21 previously described Oenomaus species. Oenomaus melleus guyanensis Faynel, 2008 is treated as a new synonym of Oenomaus melleus melleus (Druce, 1907). Females are described and associated with males for ten species using a variety of factors, including mitochondrial COI DNA "barcode" sequences. We summarize the reasons why the number of recognized Oenomaus species has grown in the past decade from one species to 28 species. Finally, we overview the habitats that Oenomaus species occupy and note that the agricultural pest on Annonaceae, Oenomaus ortygnus, is the only Oenomaus species that regularly occurs in greatly disturbed habitats.
Project description:We develop a deep learning framework (DeepAccNet) that estimates per-residue accuracy and residue-residue distance signed error in protein models and uses these predictions to guide Rosetta protein structure refinement. The network uses 3D convolutions to evaluate local atomic environments followed by 2D convolutions to provide their global contexts and outperforms other methods that similarly predict the accuracy of protein structure models. Overall accuracy predictions for X-ray and cryoEM structures in the PDB correlate with their resolution, and the network should be broadly useful for assessing the accuracy of both predicted structure models and experimentally determined structures and identifying specific regions likely to be in error. Incorporation of the accuracy predictions at multiple stages in the Rosetta refinement protocol considerably increased the accuracy of the resulting protein structure models, illustrating how deep learning can improve search for global energy minima of biomolecules.
Project description:Driving molecular dynamics simulations with data-guided collective variables offers a promising strategy to recover thermodynamic information from structure-centric experiments. Here, the three-dimensional electron density of a protein, as it would be determined by cryo-EM or x-ray crystallography, is used to obtain simultaneously the free-energy costs of conformational transitions and refined atomic structures. Unlike previous density-driven molecular dynamics methodologies that determine only the best map-model fits, our work employs the recently developed Multi-Map methodology to monitor concerted movements within equilibrium, non-equilibrium, and enhanced sampling simulations. Construction of all-atom ensembles along the chosen values of the Multi-Map variable enables simultaneous estimation of average properties, as well as real-space refinement of the structures contributing to such averages. Using three proteins of increasing size, we demonstrate that biased simulation along the reaction coordinates derived from electron densities can capture conformational transitions between known intermediates. The simulated pathways appear reversible with minimal hysteresis and require only low-resolution density information to guide the transition. The induced transitions also produce estimates for free energy differences that can be directly compared to experimental observables and population distributions. The refined models are of higher quality than those found in the Protein Data Bank. We find that the best quantitative agreement with experimental free-energy differences is obtained using medium resolution density information coupled to comparatively large structural transitions. Practical considerations for probing the transitions between multiple intermediate density states are also discussed.
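The comparison between free energy differences and population distributions mentioned above rests on the standard Boltzmann relation, ΔG = -k_B T ln(p_B / p_A). The following is a minimal sketch of that conversion, not the authors' analysis code; the function name, the choice of kcal/mol units, and the default temperature of 300 K are assumptions made for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption.
KB_KCAL_PER_MOL_K = 0.0019872041


def free_energy_difference(pop_a: float, pop_b: float,
                           temperature_k: float = 300.0) -> float:
    """Free energy of state B relative to state A from their populations.

    Uses dG = -kB * T * ln(p_B / p_A). A positive result means state B is
    less populated, and therefore higher in free energy, than state A.
    """
    if pop_a <= 0 or pop_b <= 0:
        raise ValueError("populations must be strictly positive")
    if temperature_k <= 0:
        raise ValueError("temperature must be strictly positive")
    return -KB_KCAL_PER_MOL_K * temperature_k * math.log(pop_b / pop_a)
```

For example, a 90:10 population split at 300 K corresponds to a free energy difference of roughly 1.3 kcal/mol, which is the kind of quantity a simulated free energy estimate could be checked against.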