
Dataset Information


Predicting the performance of automated crystallographic model-building pipelines.


ABSTRACT: Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model.
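The recommendation step described in the abstract can be illustrated with a small sketch. It assumes that predictions for each pipeline are already available as three numbers: the two quality-of-fit indicators (taken here to be the usual R-work and R-free, computed on included and withheld observations respectively) and structure completeness. The `Prediction` class, the `rank_pipelines` function, the pipeline names, the ranking rule, and all numeric values are hypothetical placeholders for illustration; they are not taken from the published tool or its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical container for one pipeline's predicted quality measures."""
    pipeline: str
    r_work: float        # fit to included observations (lower is better)
    r_free: float        # fit to withheld observations (lower is better)
    completeness: float  # percentage of the structure built (higher is better)


def rank_pipelines(predictions, min_completeness=0.0):
    """Order pipelines by predicted R-free, breaking ties by completeness.

    This ranking rule is an assumption made for illustration only.
    """
    eligible = [p for p in predictions if p.completeness >= min_completeness]
    return sorted(eligible, key=lambda p: (p.r_free, -p.completeness))


# Made-up predictions for three placeholder pipelines.
example = [
    Prediction("pipeline_A", r_work=0.24, r_free=0.29, completeness=91.0),
    Prediction("pipeline_B", r_work=0.22, r_free=0.26, completeness=88.0),
    Prediction("pipeline_C", r_work=0.30, r_free=0.35, completeness=60.0),
]

ranked = rank_pipelines(example, min_completeness=80.0)
print([p.pipeline for p in ranked])
```

In this sketch a user would run the top-ranked pipelines first; a different weighting of R-free against completeness may suit other goals.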

SUBMITTER: Alharbi E 

PROVIDER: S-EPMC8647178 | biostudies-literature | 2021 Dec

REPOSITORIES: biostudies-literature


Publications

Predicting the performance of automated crystallographic model-building pipelines.

Alharbi E, Bond P, Calinescu R, Cowtan K

Acta Crystallographica Section D, Structural Biology, 2021-11-29, Pt 12



Similar Datasets

S-EPMC7466752 | biostudies-literature
S-EPMC3606041 | biostudies-literature
S-EPMC6505507 | biostudies-literature
S-EPMC5770354 | biostudies-literature
S-EPMC9435595 | biostudies-literature
S-EPMC10877884 | biostudies-literature
S-EPMC3499550 | biostudies-literature
S-EPMC11006616 | biostudies-literature
S-EPMC10245678 | biostudies-literature
S-EPMC10167668 | biostudies-literature