Interrater variation in scoring radiological discrepancies.
ABSTRACT:
OBJECTIVE: Discrepancy meetings are an important aspect of clinical governance. The Royal College of Radiologists has published advice on how to conduct meetings, suggesting that discrepancies are scored using the scale: 0=no error, 1=minor error, 2=moderate error and 3=major error. We have noticed variation in the scores attributed to individual cases by radiologists and have sought to quantify the variation in scoring at our meetings.
METHODS: The scores from six discrepancy meetings, totalling 161 scored events, were collected. The reliability of scoring was measured using Fleiss' kappa, which calculates the degree of agreement in classification.
RESULTS: The number of cases rated at the six meetings ranged from 18 to 31 (mean 27). The number of raters ranged from 11 to 16 (mean 14). Only cases scored by all raters were included in the analysis. The Fleiss' kappa statistic ranged from 0.12 to 0.20, with a mean of 0.17 across the six meetings.
CONCLUSION: A kappa of 1.0 indicates perfect agreement above chance and 0.0 indicates agreement equal to chance. A rule of thumb is that a kappa ≥0.70 indicates adequate interrater agreement. Our mean result of 0.172 shows poor agreement between scorers. This could indicate a problem with the scoring system or may indicate a need for more formal training and agreement on how scores are applied.
ADVANCES IN KNOWLEDGE: Scoring of radiology discrepancies is highly subjective and shows poor interrater agreement.
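For readers who wish to reproduce this type of analysis, the sketch below shows how Fleiss' kappa can be computed from a cases-by-categories count matrix, where each entry is the number of raters who gave a case a particular score. This is a minimal Python/NumPy illustration of the standard formula, not the authors' code, and the example matrix is hypothetical rather than data from the meetings described above.

    import numpy as np

    def fleiss_kappa(counts):
        """Fleiss' kappa for a (cases x categories) matrix of rating counts.

        counts[i, j] = number of raters assigning case i to category j.
        Assumes every case was rated by the same number of raters,
        mirroring the paper's restriction to cases scored by all raters.
        """
        counts = np.asarray(counts, dtype=float)
        n_cases, _ = counts.shape
        n_raters = counts[0].sum()  # raters per case (constant by assumption)

        # Observed agreement: mean proportion of concordant rater pairs per case.
        p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
        p_bar = p_i.mean()

        # Chance agreement from the overall proportion of each score category.
        p_j = counts.sum(axis=0) / (n_cases * n_raters)
        p_e = np.square(p_j).sum()

        return (p_bar - p_e) / (1 - p_e)

    # Hypothetical example: 5 cases, 14 raters, scores 0-3 (columns = counts per score).
    example = [
        [10, 3, 1, 0],
        [2, 8, 3, 1],
        [0, 5, 7, 2],
        [1, 2, 6, 5],
        [6, 6, 2, 0],
    ]
    print(round(fleiss_kappa(example), 3))  # 1.0 = perfect agreement, 0.0 = chance level

If a library implementation is preferred, statsmodels provides an equivalent function, statsmodels.stats.inter_rater.fleiss_kappa, which accepts the same cases-by-categories count table.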
SUBMITTER: Mucci B
PROVIDER: S-EPMC3745061 | biostudies-other | 2013 Aug
REPOSITORIES: biostudies-other