
Dataset Information


Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning.


ABSTRACT: Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may misclassify images as the signal-to-noise ratio (SNR) of the image data decreases, while also demanding increased computational cost. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization for a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement in ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.

SUBMITTER: Wu J 

PROVIDER: S-EPMC5546606 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature


Publications

Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning.

Wu Jiayi, Ma Yong-Bei, Congdon Charles, Brett Bevin, Chen Shuobing, Xu Yaofang, Ouyang Qi, Mao Youdong

PLoS ONE, 2017-08-07, issue 8



Similar Datasets

EMPIAR-10069 | biostudies-other
S-EPMC6567647 | biostudies-literature
S-EPMC5154524 | biostudies-literature
S-EPMC7906460 | biostudies-literature
S-EPMC6770523 | biostudies-literature
S-EPMC6995569 | biostudies-literature
S-EPMC2765972 | biostudies-literature
S-EPMC6009202 | biostudies-literature
S-EPMC7611073 | biostudies-literature
S-EPMC6717970 | biostudies-literature