Dataset Information

Measures of agreement between many raters for ordinal classifications.

ABSTRACT: Screening and diagnostic procedures often require a physician's subjective interpretation of a patient's test result using an ordered categorical scale to define the patient's disease severity. Because of wide variability observed between physicians' ratings, many large-scale studies have been conducted to quantify agreement between multiple experts' ordinal classifications in common diagnostic procedures such as mammography. However, very few statistical approaches are available to assess agreement in these large-scale settings. Many existing summary measures of agreement rely on extensions of Cohen's kappa. These are prone to prevalence and marginal distribution issues, become increasingly complex for more than three experts, or are not easily implemented. Here we propose a model-based approach to assess agreement in large-scale studies based upon a framework of ordinal generalized linear mixed models. A summary measure of agreement is proposed for multiple experts assessing the same sample of patients' test results according to an ordered categorical scale. This measure avoids some of the key flaws associated with Cohen's kappa and its extensions. Simulation studies are conducted to demonstrate the validity of the approach with comparison with commonly used agreement measures. The proposed methods are easily implemented using the software package R and are applied to two large-scale cancer agreement studies.

SUBMITTER: Nelson KP

PROVIDER: S-EPMC4560692 | biostudies-literature | 2015 Oct

REPOSITORIES: biostudies-literature

Publications

Measures of agreement between many raters for ordinal classifications.

Kerrie P. Nelson, Don Edwards

Statistics in Medicine, 2015-06-21, issue 23

PMID: 26095449

Similar Datasets

Summary measures of agreement and association between many raters' ordinal classifications.
| S-EPMC5687310 | biostudies-literature

The impact of grey zones on the accuracy of agreement measures for ordinal tables.
| S-EPMC8048180 | biostudies-literature

Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings.
| S-EPMC5540881 | biostudies-literature

Assessing alignment between functional markers and ordinal outcomes based on broad sense agreement.
| S-EPMC7138463 | biostudies-literature

Agreement between two raters' evaluation for integrated Traditional Prosthodontic Practical Exam with Directly Observed Procedural Skills in Egypt.
| S-EPMC6249138 | biostudies-literature

Assessing method agreement for paired repeated binary measurements administered by multiple raters.
| S-EPMC7233794 | biostudies-literature

Consistently High Agreement Between Independent Raters of Niemann-Pick Type C1 Clinical Severity Scale in Phase 2/3 Trial.
| S-EPMC8900058 | biostudies-literature

Agreement Lambda for Weighted Disagreement With Ordinal Scales: Correction for Category Prevalence.
| S-EPMC12602299 | biostudies-literature

Agreement in Measures of Macular Perfusion between Optical Coherence Tomography Angiography Machines.
| S-EPMC7239842 | biostudies-literature

An Extension of the Bland-Altman Plot for Analyzing the Agreement of More than Two Raters.
| S-EPMC7824071 | biostudies-literature