Dataset Information

Better models by discarding data?


ABSTRACT: In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC1/2, that can be used for this purpose were characterized and it was shown that CC1/2 has superior properties compared with 'merging' R values. A derived quantity, CC*, links data and model quality. Using experimental data sets, the behaviour of CC1/2 and the more conventional indicators were compared in two situations of practical importance: merging data sets from different crystals and selectively rejecting weak observations or (merged) unique reflections from a data set. In these situations controlled 'paired-refinement' tests show that even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality. These results show the folly of such data-filtering practices aimed at improving the merging R values. Interestingly, in all of these tests CC1/2 is the one data-quality indicator for which the behaviour accurately reflects which of the alternative data-handling strategies results in the best-quality refined model. Its properties in the presence of systematic error are documented and discussed.
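
For readers unfamiliar with the two indicators named in the abstract, here is a minimal Python sketch of how they are commonly computed. It is not taken from this record and is not the authors' code. It assumes the usual definitions: CC1/2 is the Pearson correlation between the mean intensities of two random half-sets of the observations of each unique reflection, and CC* = sqrt(2 CC1/2 / (1 + CC1/2)). The function names (`pearson`, `cc_half`, `cc_star`) and the dictionary input layout are hypothetical, chosen only for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, rng):
    """Estimate CC1/2 from repeated intensity measurements.

    observations: dict mapping a unique reflection (any hashable key) to a
    list of its measured intensities (hypothetical layout, assumed here).
    Reflections measured fewer than twice cannot be split and are skipped.
    """
    half_a, half_b = [], []
    for measurements in observations.values():
        if len(measurements) < 2:
            continue
        shuffled = list(measurements)
        rng.shuffle(shuffled)          # random assignment to the two halves
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)


def cc_star(cc12):
    """CC* = sqrt(2*CC1/2 / (1 + CC1/2)); only meaningful for 0 <= CC1/2 <= 1."""
    if not 0.0 <= cc12 <= 1.0:
        raise ValueError("CC* is only defined here for 0 <= CC1/2 <= 1")
    return math.sqrt(2.0 * cc12 / (1.0 + cc12))


if __name__ == "__main__":
    # Synthetic data (assumption): exponential "true" intensities plus
    # Gaussian measurement noise, four observations per reflection.
    rng = random.Random(0)
    data = {}
    for hkl in range(500):
        true_intensity = rng.expovariate(1.0)
        data[hkl] = [true_intensity + rng.gauss(0.0, 1.0) for _ in range(4)]
    cc12 = cc_half(data, rng)
    print("CC1/2 =", round(cc12, 3))
    print("CC*   =", round(cc_star(max(cc12, 0.0)), 3))
```

Because the half-set assignment is random, the value of CC1/2 varies slightly between runs unless the random generator is seeded, as it is in the demo above.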

SUBMITTER: Diederichs K 

PROVIDER: S-EPMC3689524 | biostudies-literature | 2013 Jul

REPOSITORIES: biostudies-literature

Publications

Better models by discarding data?

Diederichs K, Karplus PA

Acta Crystallographica Section D: Biological Crystallography, 2013-06-15, Pt 7


Similar Datasets

| S-EPMC5961799 | biostudies-literature
| S-EPMC6911130 | biostudies-literature
| S-EPMC5415519 | biostudies-literature
2007-06-21 | GSE7763 | GEO
| S-EPMC4744765 | biostudies-literature
| S-EPMC1839111 | biostudies-literature
| S-EPMC6859306 | biostudies-literature
| S-EPMC6819607 | biostudies-literature
| S-EPMC3211246 | biostudies-literature
| S-EPMC6554483 | biostudies-literature