Project description: Although structures determined at near-atomic resolution are now routinely reported by cryo-electron microscopy (cryo-EM), many density maps are still determined at intermediate resolution, and extracting structural information from these maps remains a challenge. We report a computational method, Emap2sec, that identifies the secondary structures of proteins (α-helices, β-sheets and other structures) in EM maps at resolutions between 5 and 10 Å. Emap2sec uses a three-dimensional deep convolutional neural network to assign a secondary structure to each grid point in an EM map. We tested Emap2sec on EM maps simulated from 34 structures at resolutions of 6.0 and 10.0 Å, as well as on 43 experimentally determined maps at resolutions between 5.0 and 9.5 Å. Emap2sec clearly identified the secondary structures in many of the maps tested and performed substantially better than existing methods.
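A minimal sketch of the voxel-wise classification setup this description outlines: a small 3D CNN that takes a density patch centred on a grid point and predicts one of three secondary-structure classes. The 11-voxel patch size, layer widths, and depth are illustrative assumptions, not the published Emap2sec architecture.

import torch
import torch.nn as nn

class PatchSSClassifier(nn.Module):
    def __init__(self, n_classes: int = 3):   # helix, sheet, other
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),                   # 11^3 -> 5^3
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):                      # x: (B, 1, 11, 11, 11) density patches
        return self.head(self.features(x).flatten(1))

# One label per grid point: classify the patch extracted around that point.
logits = PatchSSClassifier()(torch.randn(4, 1, 11, 11, 11))
print(logits.shape)                            # torch.Size([4, 3])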
Project description: Cryo-EM has emerged as the most important technique for structure determination of macromolecular complexes. However, raw cryo-EM maps often exhibit loss of contrast at high resolution and heterogeneity across the map, and various post-processing methods have been proposed to improve them. Nevertheless, improving both the quality and the interpretability of EM maps remains challenging. To address this challenge, we present EMReady, a three-dimensional Swin-Conv-UNet-based deep learning framework for improving cryo-EM maps, which implements both local and non-local modeling modules in a multiscale UNet architecture and whose loss function simultaneously minimizes the local smooth-L1 distance and maximizes the non-local structural similarity between processed experimental maps and simulated target maps. EMReady was extensively evaluated on diverse test sets of 110 primary cryo-EM maps and 25 pairs of half-maps at 3.0-6.0 Å resolution and compared with five state-of-the-art map post-processing methods. EMReady not only robustly enhances the quality of cryo-EM maps in terms of map-model correlations, but also improves the interpretability of the maps in automatic de novo model building.
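A minimal sketch of the combined objective described above, assuming a smooth-L1 term plus (1 - structural similarity). For brevity the similarity term is computed here as a single global SSIM over the volume with equal weights alpha and beta; the published EMReady method uses a windowed, non-local formulation inside a Swin-Conv-UNet, which is not reproduced here.

import torch
import torch.nn.functional as F

def global_ssim(x, y, c1=0.01**2, c2=0.03**2):
    # Single SSIM value over the whole volume (simplified, non-windowed).
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(unbiased=False), y.var(unbiased=False)
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def emready_style_loss(pred, target, alpha=1.0, beta=1.0):
    # Smooth-L1 keeps local voxel values close; (1 - SSIM) rewards structural similarity.
    return alpha * F.smooth_l1_loss(pred, target) + beta * (1.0 - global_ssim(pred, target))

pred, target = torch.rand(1, 1, 48, 48, 48), torch.rand(1, 1, 48, 48, 48)
print(emready_style_loss(pred, target).item())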
Project description: Cryo-electron microscopy (cryo-EM) has become one of the most important experimental methods in structure determination. However, despite the rapid growth in the number of deposited cryo-EM maps, driven by advances in microscopy instruments and image-processing algorithms, building accurate structure models from cryo-EM maps remains a challenge. Protein secondary structure information, which can be extracted from EM maps, is beneficial for cryo-EM structure modeling. Here, we present EMNUSS, a novel secondary structure annotation framework for cryo-EM maps at both intermediate and high resolutions. EMNUSS adopts a three-dimensional (3D) nested U-Net architecture to assign secondary structures to EM maps. Tested on three diverse datasets comprising simulated maps, intermediate-resolution experimental maps, and high-resolution experimental maps, EMNUSS demonstrated its accuracy and robustness in identifying the secondary structures of cryo-EM maps at various resolutions. The EMNUSS program is freely available at http://huanglab.phys.hust.edu.cn/EMNUSS.
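A minimal sketch of per-voxel secondary-structure segmentation with a tiny 3D encoder/decoder and a single skip connection. EMNUSS itself uses a nested U-Net architecture; this simplified network and the four-class labelling (helix, sheet, coil, background) are assumptions meant only to illustrate the voxel-wise annotation setup.

import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv3d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv3d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet3D(nn.Module):
    def __init__(self, n_classes=4):           # helix, sheet, coil, background (assumed)
        super().__init__()
        self.enc1, self.enc2 = block(1, 16), block(16, 32)
        self.pool = nn.MaxPool3d(2)
        self.up = nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False)
        self.dec = block(48, 16)                # 32 upsampled channels + 16 skip channels
        self.out = nn.Conv3d(16, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return self.out(d)                      # per-voxel logits: (B, n_classes, D, H, W)

print(TinyUNet3D()(torch.randn(1, 1, 32, 32, 32)).shape)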
Project description: Cryo-electron microscopy (cryo-EM) has become a leading technology for determining protein structures, and recent advances in the field have allowed for atomic resolution. However, predicting the backbone trace of a protein has remained a challenge on all but the most pristine density maps (<2.5 Å resolution). Here we introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict Cα atoms along a protein's backbone structure. The cascaded-CNN (C-CNN) is a novel deep learning architecture composed of multiple CNNs, each predicting a specific aspect of a protein's structure. The model predicts secondary structure elements (SSEs), backbone structure, and Cα atoms, combining these results to produce a complete prediction map. The cascaded-CNN is a semantic segmentation image classifier and was trained on thousands of simulated density maps. The method is largely automatic and requires only a recommended threshold value for each protein density map. A specialized tabu-search path-walking algorithm was used to produce an initial backbone trace with Cα placements. A helix-refinement algorithm made further improvements to the α-helix SSEs of the backbone trace. Finally, a novel quality-assessment-based combinatorial algorithm was used to map protein sequences onto the Cα traces and obtain full-atom protein structures. The method was tested on 50 experimental maps between 2.6 Å and 4.4 Å resolution. It outperformed several state-of-the-art prediction methods, including Rosetta de novo, MAINMAST, and a Phenix-based method, by producing the most complete predicted protein structures, as measured by the percentage of found Cα atoms. It accurately predicted 88.9% (mean) of the Cα atoms within 3 Å of a protein's backbone structure, surpassing the 66.8% achieved by the leading alternative method (the Phenix-based fully automatic method) on the same set of density maps. The C-CNN also achieved an average root-mean-square deviation (RMSD) of 1.24 Å on a set of 50 experimental density maps that was also tested with the Phenix-based fully automatic method. The source code and a demo of this research have been published at https://github.com/DrDongSi/Ca-Backbone-Prediction.
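A minimal sketch of the cascading idea: three small 3D CNN stages, where each stage receives the density map concatenated with the previous stage's output (secondary structure, then backbone confidence, then Cα confidence). The channel counts, depths, and class layout are placeholders, not the published C-CNN, and the downstream tabu-search tracing is not shown.

import torch
import torch.nn as nn

def stage(cin, cout):
    return nn.Sequential(nn.Conv3d(cin, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv3d(32, cout, 3, padding=1))

class CascadedCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.sse = stage(1, 4)              # helix / sheet / loop / background (assumed classes)
        self.backbone = stage(1 + 4, 1)     # density + SSE logits -> backbone confidence
        self.calpha = stage(1 + 4 + 1, 1)   # density + SSE + backbone -> C-alpha confidence

    def forward(self, density):
        sse = self.sse(density)
        bb = self.backbone(torch.cat([density, sse], dim=1))
        ca = self.calpha(torch.cat([density, sse, bb], dim=1))
        return sse, bb, ca                  # per-voxel predictions from each cascade stage

sse, bb, ca = CascadedCNN()(torch.randn(1, 1, 32, 32, 32))
print(sse.shape, bb.shape, ca.shape)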
Project description: Deep learning has emerged as the technique of choice for identifying hidden patterns in cell imaging data but is often criticized as a "black box." Here, we employ a generative neural network in combination with supervised machine learning to classify patient-derived melanoma xenografts as "efficient" or "inefficient" metastasizers, validate the predictions on melanoma cell lines with unknown metastatic efficiency in mouse xenografts, and use the network to generate in silico cell images that amplify the critical predictive cell properties. These exaggerated images unveiled pseudopodial extensions and increased light scattering as hallmark properties of metastatic cells. We validated this interpretation using live cells spontaneously transitioning between states indicative of low and high metastatic efficiency. This study illustrates how artificial intelligence can support the identification of cellular properties that are predictive of complex phenotypes and integrated cell functions but are too subtle to be identified in the raw imagery by a human expert.
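A hedged sketch of one way the "in silico amplification" step could work: encode a cell image into a latent code, shift the code along the direction a linear classifier associates with high metastatic efficiency, and decode the shifted code into an exaggerated image. The networks, weights, image size, and the linear-direction assumption below are all illustrative placeholders; the study's actual generative model and classifier are not reproduced.

import torch
import torch.nn as nn

latent_dim = 64
encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, latent_dim))          # untrained stand-in
decoder = nn.Sequential(nn.Linear(latent_dim, 64 * 64), nn.Unflatten(1, (1, 64, 64)))
classifier_w = torch.randn(latent_dim)    # stands in for a trained linear classifier on latent codes

def amplify(image, alpha=3.0):
    z = encoder(image)
    direction = classifier_w / classifier_w.norm()   # unit direction toward the "efficient" class
    return decoder(z + alpha * direction)            # decode the exaggerated latent code

exaggerated = amplify(torch.rand(1, 1, 64, 64))
print(exaggerated.shape)                             # torch.Size([1, 1, 64, 64])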
Project description: Measurement of the width of the fetal lateral ventricles (LVs) in prenatal ultrasound (US) images is essential for antenatal neurosonographic assessment. However, manual measurement of LV width is highly subjective and relies on the clinical experience of the operator. To address this challenge, we propose a computer-aided detection framework for the automatic measurement of fetal LVs in two-dimensional US images. First, we train a deep convolutional network on 2,400 LV images to perform pixel-wise segmentation. Then, the number of pixels per centimeter (PPC), a parameter essential for quantifying the caliper in US images, is obtained via morphological operations guided by prior knowledge. The estimated PPC, once converted to a physical length, is used to determine the diameter of the LV with the minimum enclosing rectangle method. Extensive experiments on a self-collected dataset demonstrate that the proposed method achieves performance superior to manual measurement, with a mean absolute measurement error of 1.8 mm. The proposed method is fully automatic and is shown to reduce the measurement bias caused by improper US scanning.
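A hedged sketch of the measurement step only: given a binary segmentation mask of the ventricle and an estimated pixels-per-centimeter (PPC) value, fit a minimum enclosing (rotated) rectangle and report its shorter side as the LV width in millimeters. The segmentation network and the caliper-based PPC estimation are assumed to have been run beforehand, and taking the shorter side as the width is an assumption for illustration.

import cv2
import numpy as np

def lv_width_mm(mask: np.ndarray, ppc: float) -> float:
    contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)     # keep the largest segmented region
    (_, _), (w, h), _ = cv2.minAreaRect(largest)     # minimum enclosing rotated rectangle
    width_px = min(w, h)                             # shorter side taken as the LV width
    return width_px / ppc * 10.0                     # pixels -> centimeters -> millimeters

# Synthetic demo: a filled, tilted ellipse standing in for a segmented ventricle.
demo_mask = np.zeros((200, 200), np.uint8)
cv2.ellipse(demo_mask, (100, 100), (60, 20), 30, 0, 360, 1, -1)
print(round(lv_width_mm(demo_mask, ppc=40.0), 1))    # roughly 10 mm for this synthetic shape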
Project description: The forces exerted by single cells in three-dimensional (3D) environments play a crucial role in modulating cellular functions and behaviors closely related to physiological and pathological processes. Cellular force microscopy (CFM) provides a feasible way to quantify these mechanical interactions, typically recovering cellular forces from the deformation of extracellular matrices embedded with fluorescent beads. Owing to its computational complexity, traditional 3D-CFM is usually extremely time consuming, which makes efficient force recovery and large-scale sample analysis challenging. With the aid of deep neural networks, this study puts forward a novel, data-driven 3D-CFM that reconstructs 3D cellular force fields directly from volumetric images with random fluorescence patterns. The deep-learning-based network is built by stacking deep convolutional neural networks (DCNNs) and specific function layers, and physical information associated with the constitutive relation of the extracellular matrix material is coupled into the data-driven network. Mini-batch stochastic gradient descent and back-propagation are used to ensure convergence and training efficiency. The networks not only show good generalization ability and robustness but can also recover 3D cellular forces directly from input fluorescence image pairs. In particular, the computational efficiency of the deep-learning-based network is at least one to two orders of magnitude higher than that of traditional 3D-CFM. This study provides a novel scheme for developing high-performance 3D-CFM to quantitatively characterize mechanical interactions between single cells and the surrounding extracellular matrix, which is of vital importance for quantitative investigations in biomechanics and mechanobiology.
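A minimal sketch of the data-driven mapping this description outlines: a 3D CNN that takes a pair of fluorescence volumes (reference and deformed) as two input channels and regresses a three-component force field on the same grid, trained with mini-batch SGD. The architecture, loss, and data here are illustrative; the actual stacked DCNNs and the coupling to the matrix constitutive relation are not modeled.

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv3d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv3d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv3d(32, 3, 3, padding=1),              # (Fx, Fy, Fz) per voxel
)
optim = torch.optim.SGD(model.parameters(), lr=1e-3)

image_pair = torch.rand(4, 2, 32, 32, 32)        # batch of reference/deformed volume pairs
force_gt = torch.rand(4, 3, 32, 32, 32)          # synthetic target force field for illustration

for _ in range(2):                               # two illustrative mini-batch SGD steps
    optim.zero_grad()
    loss = nn.functional.mse_loss(model(image_pair), force_gt)
    loss.backward()
    optim.step()
print(loss.item())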
Project description: An increasing number of density maps of macromolecular structures, including proteins and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although maps at near-atomic resolution are now routinely reported, a substantial fraction of maps are still determined at intermediate or low resolution, where extracting structural information is not trivial. Here, we report a new computational method, Emap2sec+, which identifies DNA and RNA as well as the secondary structures of proteins in cryo-EM maps at 5 to 10 Å resolution. Emap2sec+ employs a deep residual convolutional neural network and assigns a structural label with associated probabilities to each voxel in a cryo-EM map, which helps structure modeling in the EM map. Emap2sec+ showed stable and high assignment accuracy for nucleotides in low-resolution maps and improved performance over its earlier version for protein secondary structure assignment when tested on simulated and experimental maps.
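A hedged sketch of the voxel-labelling setup: a small 3D residual network that outputs per-voxel probabilities over an assumed label set {helix, sheet, other protein, DNA/RNA, background}. The block layout is illustrative and far shallower than the published Emap2sec+ network.

import torch
import torch.nn as nn

class ResBlock3D(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv3d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv3d(ch, ch, 3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.conv2(self.act(self.conv1(x))))   # identity skip connection

class VoxelLabeler(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.stem = nn.Conv3d(1, 32, 3, padding=1)
        self.blocks = nn.Sequential(ResBlock3D(32), ResBlock3D(32))
        self.head = nn.Conv3d(32, n_classes, 1)

    def forward(self, x):
        return torch.softmax(self.head(self.blocks(self.stem(x))), dim=1)

probs = VoxelLabeler()(torch.randn(1, 1, 24, 24, 24))
print(probs.shape)   # per-voxel class probabilities: (1, 5, 24, 24, 24)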
Project description: Crops require appropriate planting techniques at different growth stages, and judgments of crop maturity affect yield. The planting and management of crops rely heavily on experienced farmers, whose judgment can reduce planting costs and increase yields. With the advancement of smart agriculture [1], images of crops can be used to accurately determine their growth stage and estimate yields [2], and, combined with drones or smartphones, could in the future help farmers predict the growth stage and yield of Fortunella margarita. This article presents an F. margarita image dataset. We classified F. margarita into three growth stages: mature, immature, and growing. Because a single image may contain plants at several growth stages, the images were divided into seven categories according to growth stage. The dataset contains 1031 original images, increased to 6611 through data augmentation, and includes 6611 annotations with manually marked positions of F. margarita across the 7 categories. Field images were captured in Jiaoxi, Yilan County, Taiwan, using smartphones. The dataset can serve as a resource for researchers applying machine learning or deep learning algorithms to object detection, image segmentation, and multiclass classification.
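A hedged sketch of a generic augmentation pass of the kind used to expand such an image set (horizontal flip, small rotation, brightness jitter). The directory layout and file names below are hypothetical, and the dataset's actual augmentation recipe and annotation format are not specified in the description; note that geometric augmentations would also require transforming the marked positions accordingly, which is not shown.

from pathlib import Path
from PIL import Image, ImageEnhance, ImageOps

def augment(img: Image.Image):
    yield ImageOps.mirror(img)                           # horizontal flip
    yield img.rotate(15, expand=True)                    # small rotation
    yield ImageEnhance.Brightness(img).enhance(1.3)      # brightness jitter

src = Path("fortunella/original")                        # hypothetical input directory
dst = Path("fortunella/augmented")                       # hypothetical output directory
dst.mkdir(parents=True, exist_ok=True)
for path in src.glob("*.jpg"):
    with Image.open(path) as img:
        for i, aug in enumerate(augment(img)):
            aug.save(dst / f"{path.stem}_aug{i}.jpg")    # three augmented copies per original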