Project description:Whole cell cryo-electron tomography emerges as an important component for structural system biology approaches. It allows the localization and structural characterization of macromolecular complexes in near living conditions. However, the method is hampered by low resolution, missing data and low signal-to-noise ratio (SNR). To overcome some of these difficulties one can align and average a large set of subtomograms. Existing alignment methods are mostly based on an exhaustive scanning and sampling of all but discrete relative rotations and translations of one subtomogram with respect to the other. In this paper, we propose a gradient-guided alignment method based on two subtomogram similarity measures. We also propose a stochastic parallel optimization that increases significantly the efficiency for the simultaneous refinement of a set of alignment candidates. Results on simulated data of model complexes and experimental structures of protein complexes show that even for highly distorted subtomograms and with only a small number of very sparsely distributed initial alignment seeds, our method can accurately recover true transformations with a significantly higher precision than scanning based alignment methods.
Project description:Background: Cryo-electron tomography emerges as an important component for structural system biology. It not only allows the structural characterization of macromolecular complexes, but also the detection of their cellular localizations in near living conditions. However, the method is hampered by low resolution, missing data and low signal-to-noise ratio (SNR). To overcome some of these difficulties and enhance the nominal resolution one can align and average a large set of subtomograms. Existing methods for obtaining the optimal alignments are mostly based on an exhaustive scanning of all but discrete relative rigid transformations (i.e. rotations and translations) of one subtomogram with respect to the other. Results: In this paper, we propose gradient-guided alignment methods based on two popular subtomogram similarity measures, a real space as well as a Fourier-space constrained score. We also propose a stochastic parallel refinement method that increases significantly the efficiency for the simultaneous refinement of a set of alignment candidates. We estimate that our stochastic parallel refinement is on average about 20 to 40 fold faster in comparison to the standard independent refinement approach. Results on simulated data of model complexes and experimental structures of protein complexes show that even for highly distorted subtomograms and with only a small number of very sparsely distributed initial alignment seeds, our combined methods can accurately recover true transformations with a substantially higher precision than the scanning based alignment methods. Conclusions: Our methods increase significantly the efficiency and accuracy for subtomogram alignments, which is a key factor for the systematic classification of macromolecular complexes in cryo-electron tomograms of whole cells.
Project description:Background: Cryo-electron tomography (Cryo-ET) is an imaging technique used to generate three-dimensional structures of cellular macromolecule complexes in their native environment. Due to developing cryo-electron microscopy technology, the image quality of three-dimensional reconstruction of cryo-electron tomography has greatly improved. However, cryo-ET images are characterized by low resolution, partial data loss and low signal-to-noise ratio (SNR). In order to tackle these challenges and improve resolution, a large number of subtomograms containing the same structure need to be aligned and averaged. Existing methods for refining and aligning subtomograms are still highly time-consuming, requiring many computationally intensive processing steps (i.e. the rotations and translations of subtomograms in three-dimensional space). Results: In this article, we propose a Stochastic Average Gradient (SAG) fine-grained alignment method for optimizing the sum of dissimilarity measure in real space. We introduce a Message Passing Interface (MPI) parallel programming model in order to explore further speedup. Conclusions: We compare our stochastic average gradient fine-grained alignment algorithm with two baseline methods, high-precision alignment and fast alignment. Our SAG fine-grained alignment algorithm is much faster than the two baseline methods. Results on simulated data of GroEL from the Protein Data Bank (PDB ID:1KP8) showed that our parallel SAG-based fine-grained alignment method could achieve close-to-optimal rigid transformations with higher precision than both high-precision alignment and fast alignment at a low SNR (SNR=0.003) with tilt angle range ±60° or ±40°. For the experimental subtomograms data structures of GroEL and GroEL/GroES complexes, our parallel SAG-based fine-grained alignment achieves higher precision and converges in fewer iterations than the two baseline methods.
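Illustration (not from the abstract above): a minimal sketch of the generic stochastic average gradient update, applied to a toy one-parameter least-squares problem rather than to subtomogram alignment. The names sag_minimize, grad_i, xs and ys are hypothetical, and the step size 1/(16 L) is a conservative textbook choice, not a value taken from the paper. The idea shown is that SAG stores the last gradient seen for each term of the sum, refreshes one stored gradient per iteration, and steps along the average of the stored gradients.

```python
import random


def sag_minimize(grad_i, n, w0, step, iters, seed=0):
    """Minimize (1/n) * sum_i f_i(w) for a scalar w with stochastic average gradient.

    grad_i(i, w) returns the gradient of the i-th term at w (hypothetical callback).
    """
    rng = random.Random(seed)
    w = w0
    memory = [0.0] * n   # last gradient evaluated for each term
    grad_sum = 0.0       # running sum of the stored gradients
    for _ in range(iters):
        i = rng.randrange(n)          # pick one term at random
        g = grad_i(i, w)              # fresh gradient for that term only
        grad_sum += g - memory[i]     # replace the stale entry in the sum
        memory[i] = g
        w -= step * grad_sum / n      # step along the averaged gradient
    return w


# Toy data (assumed for illustration): y = 3 * x exactly, so the minimizer is w = 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def grad_i(i, w):
    # Gradient of f_i(w) = 0.5 * (w * x_i - y_i) ** 2
    return (w * xs[i] - ys[i]) * xs[i]


lipschitz = max(x * x for x in xs)    # largest per-term curvature
w_hat = sag_minimize(grad_i, len(xs), w0=0.0, step=1.0 / (16.0 * lipschitz), iters=5000)
print(w_hat)
```

In an alignment setting, w would be replaced by the rigid transformation parameters and each f_i by one dissimilarity term; that mapping is an assumption made here for orientation only.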
Project description:Particle identification and selection, which is a prerequisite for high-resolution structure determination of biological macromolecules via single-particle cryo-electron microscopy, poses a major bottleneck for automating the steps of structure determination. Here, we present a generalized deep learning tool, CASSPER, for the automated detection and isolation of protein particles in transmission microscope images. This deep learning tool uses semantic segmentation and a collection of visually prepared training samples to capture the differences in the transmission intensities of protein, ice, carbon, and other impurities found in the micrograph. CASSPER performs pixel-level classification and completely eliminates the need for manual particle picking. Integration of Contrast Limited Adaptive Histogram Equalization (CLAHE) in CASSPER enables high-fidelity particle detection in micrographs with variable ice thickness and contrast. A generalized CASSPER model works with high efficiency on unseen datasets and can potentially pick particles on-the-fly, enabling data processing automation.
Project description:To solve three-dimensional structures of biological macromolecules in situ, large numbers of particles often need to be picked from cryo-electron tomograms. However, adoption of automated particle-picking methods remains limited because of their technical limitations. To overcome the limitations, we develop DeepETPicker, a deep learning model for fast and accurate picking of particles from cryo-electron tomograms. Training of DeepETPicker requires only weak supervision with low numbers of simplified labels, reducing the burden of manual annotation. The simplified labels combined with the customized and lightweight model architecture of DeepETPicker and accelerated pooling enable substantial performance improvement. When tested on simulated and real tomograms, DeepETPicker outperforms the competing state-of-the-art methods by achieving the highest overall accuracy and speed, which translate into higher authenticity and coordinates accuracy of picked particles and higher resolutions of final reconstruction maps. DeepETPicker is provided in open source with a user-friendly interface to support cryo-electron tomography in situ.
Project description:Deep learning has shown potential in domains with large-scale annotated datasets. However, manual annotation is expensive, time-consuming, and tedious. Pixel-level annotations are particularly costly for semantic segmentation in images with dense irregular patterns of object instances, such as in plant images. In this work, we propose a method for developing high-performing deep learning models for semantic segmentation of such images utilizing little manual annotation. As a use case, we focus on wheat head segmentation. We synthesize a computationally annotated dataset (using a few annotated images, a short unannotated video clip of a wheat field, and several video clips with no wheat) to train a customized U-Net model. Considering the distribution shift between the synthesized and real images, we apply three domain adaptation steps to gradually bridge the domain gap. Using only two annotated images, we achieved a Dice score of 0.89 on the internal test set. When further evaluated on a diverse external dataset collected from 18 different domains across five countries, this model achieved a Dice score of 0.73. To expose the model to images from different growth stages and environmental conditions, we incorporated two annotated images from each of the 18 domains to further fine-tune the model. This increased the Dice score to 0.91. The result highlights the utility of the proposed approach in the absence of large annotated datasets. Although our use case is wheat head segmentation, the proposed approach can be extended to other segmentation tasks with similar characteristics of irregularly repeating patterns of object instances.
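Illustration (not from the abstract above): the Dice score reported for the segmentation results is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. The sketch below computes it for flattened binary masks; the function name dice_score and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not details from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked foreground in both masks
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# One overlapping pixel, two predicted and one reference foreground pixel
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the masks coincide and 0.0 means they share no foreground pixel.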
Project description:Motivation: Cryo-electron tomography allows the imaging of macromolecular complexes in near living conditions. To enhance the nominal resolution of a structure it is necessary to align and average individual subtomograms each containing identical complexes. However, if the sample of complexes is heterogeneous, it is necessary to first classify subtomograms into groups of identical complexes. This task becomes challenging when tomograms contain mixtures of unknown complexes extracted from a crowded environment. Two main challenges must be overcome. First, classification of subtomograms must be performed without knowledge of template structures. However, most alignment methods are too slow to perform reference-free classification of a large number (e.g. tens of thousands) of subtomograms. Second, subtomograms extracted from crowded cellular environments often contain fragments of other structures besides the target complex. However, alignment methods generally assume that each subtomogram only contains one complex. Automatic methods are needed to identify the target complex in a subtomogram even when its shape is unknown. Results: In this article, we propose an automatic and systematic method for the isolation and masking of target complexes in subtomograms extracted from crowded environments. Moreover, we also propose a fast alignment method using fast rotational matching in real space. Our experiments show that, compared with our previously proposed fast alignment method in reciprocal space, our new method significantly improves the alignment accuracy for highly distorted and especially crowded subtomograms. Such improvements are important for achieving successful and unbiased high-throughput reference-free structural classification of complexes inside whole-cell tomograms. Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:Cryogenic electron tomography (cryo-ET) has rapidly advanced as a high-resolution imaging tool for visualizing subcellular structures in 3D with molecular detail. Direct image inspection remains challenging due to inherent low signal-to-noise ratios (SNR). We introduce CryoSamba, a self-supervised deep learning-based model designed for denoising cryo-ET images. CryoSamba enhances single consecutive 2D planes in tomograms by averaging motion-compensated nearby planes through deep learning interpolation, effectively mimicking increased exposure. This approach amplifies coherent signals and reduces high-frequency noise, substantially improving tomogram contrast and SNR. CryoSamba operates on 3D volumes without needing pre-recorded images, synthetic data, labels or annotations, noise models, or paired volumes. CryoSamba suppresses high-frequency information less aggressively than do existing cryo-ET denoising methods, while retaining real information, as shown both by visual inspection and by Fourier shell correlation analysis of icosahedrally symmetric virus particles. Thus, CryoSamba enhances the analytical pipeline for direct 3D tomogram visual interpretation.
Project description:Cellular processes are governed by macromolecular complexes inside the cell. Study of the native structures of macromolecular complexes has been extremely difficult due to lack of data. With recent breakthroughs in Cellular Electron Cryo-Tomography (CECT) 3D imaging technology, it is now possible for researchers to gain access to, and more fully study, the macromolecular structures within single cells. However, systematic recovery of macromolecular structures from CECT is very difficult due to the high degree of structural complexity and practical imaging limitations. In previous work, we proposed a deep learning-based image classification approach for large-scale systematic macromolecular structure separation from CECT data. However, that work was only a very initial step toward exploring the full potential of deep learning-based macromolecule separation. In this paper, we focus on improving classification performance by proposing three newly designed individual CNN models: an extended version of the Deep Small Receptive Field model (DSRF3D), denoted DSRF3D-v2; a 3D residual block-based neural network, named RB3D; and a convolutional 3D (C3D)-based model, CB3D. We compare them with our previously developed model (DSRF3D) on 12 datasets with different SNRs and tilt angle ranges. The experiments show that our new models achieved significantly higher classification accuracies. The accuracies are not only higher than 0.9 on normal datasets, but the models also show potential to operate on datasets with high levels of noise and missing wedge effects.
Project description:Motivation: Cryo-Electron Tomography (cryo-ET) is a 3D imaging technology that enables the visualization of subcellular structures in situ at near-atomic resolution. Cellular cryo-ET images help in resolving the structures of macromolecules and determining their spatial relationship in a single cell, which has broad significance in cell and structural biology. Subtomogram classification and recognition constitute a primary step in the systematic recovery of these macromolecular structures. Supervised deep learning methods have been proven to be highly accurate and efficient for subtomogram classification, but suffer from limited applicability due to scarcity of annotated data. While generating simulated data for training supervised models is a potential solution, a sizeable difference in the image intensity distribution in generated data as compared with real experimental data will cause the trained models to perform poorly in predicting classes on real subtomograms. Results: In this work, we present Cryo-Shift, a fully unsupervised domain adaptation and randomization framework for deep learning-based cross-domain subtomogram classification. We use unsupervised multi-adversarial domain adaptation to reduce the domain shift between features of simulated and experimental data. We develop a network-driven domain randomization procedure with 'warp' modules to alter the simulated data and help the classifier generalize better on experimental data. We do not use any labeled experimental data to train our model, whereas some of the existing alternative approaches require labeled experimental samples for cross-domain classification.
Nevertheless, Cryo-Shift outperforms the existing alternative approaches in cross-domain subtomogram classification in extensive evaluation studies demonstrated herein using both simulated and experimental data. Availability and implementation: https://github.com/xulabs/aitom. Supplementary information: Supplementary data are available at Bioinformatics online.