Dataset Information

New method to compute Rcomplete enables maximum likelihood refinement for small datasets.


ABSTRACT: The crystallographic reliability index Rcomplete is based on a method proposed more than two decades ago. Because its calculation is computationally expensive its use did not spread into the crystallographic community in favor of the cross-validation method known as Rfree. The importance of Rfree has grown beyond a pure validation tool. However, its application requires a sufficiently large dataset. In this work we assess the reliability of Rcomplete and we compare it with k-fold cross-validation, bootstrapping, and jackknifing. As opposed to proper cross-validation as realized with Rfree, Rcomplete relies on a method of reducing bias from the structural model. We compare two different methods reducing model bias and question the widely spread notion that random parameter shifts are required for this purpose. We show that Rcomplete has as little statistical bias as Rfree with the benefit of a much smaller variance. Because the calculation of Rcomplete is based on the entire dataset instead of a small subset, it allows the estimation of maximum likelihood parameters even for small datasets. Rcomplete enables maximum likelihood-based refinement to be extended to virtually all areas of crystallographic structure determination including high-pressure studies, neutron diffraction studies, and datasets from free electron lasers.

SUBMITTER: Luebben J 

PROVIDER: S-EPMC4517205 | biostudies-literature | 2015 Jul

REPOSITORIES: biostudies-literature

Publications

New method to compute Rcomplete enables maximum likelihood refinement for small datasets.

Jens Luebben, Tim Gruene

Proceedings of the National Academy of Sciences of the United States of America, 2015-07-06, issue 29


Similar Datasets

| S-EPMC5892877 | biostudies-literature
2024-03-20 | GSE261769 | GEO
| S-EPMC4257616 | biostudies-literature
| S-EPMC1323467 | biostudies-literature
| S-EPMC7454987 | biostudies-literature
| S-EPMC10427019 | biostudies-literature
| S-EPMC3313049 | biostudies-literature
| S-EPMC2792768 | biostudies-literature
| S-EPMC6371681 | biostudies-literature
| S-EPMC7499257 | biostudies-literature