Dataset Information

Reproducibility of the STARD checklist: an instrument to assess the quality of reporting of diagnostic accuracy studies.

ABSTRACT:

Background

In January 2003, STAndards for the Reporting of Diagnostic accuracy studies (STARD) were published in a number of journals, to improve the quality of reporting in diagnostic accuracy studies. We designed a study to investigate the inter-assessment reproducibility, and intra- and inter-observer reproducibility of the items in the STARD statement.

Methods

Thirty-two diagnostic accuracy studies published in 2000 in medical journals with an impact factor of at least 4 were included. Two reviewers independently evaluated the quality of reporting of these studies using the 25 items of the STARD statement. A consensus evaluation was obtained by discussing and resolving disagreements between reviewers. Almost two years later, the same studies were evaluated by the same reviewers. For each item, percentage agreement and Cohen's kappa between the first and second consensus assessments (inter-assessment) were calculated. Intraclass correlation coefficients (ICC) were calculated to evaluate the reliability of the checklist.
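As an illustration of the two per-item agreement measures named above, the following is a minimal sketch of percentage agreement and Cohen's kappa for a single checklist item. It is not the authors' code: the function names and the example ratings (1 = item reported, 0 = not reported) are hypothetical, and the data are invented purely to show the arithmetic.

```python
import math
from collections import Counter


def percent_agreement(first, second):
    """Fraction of studies that received the same rating in both assessments."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohens_kappa(first, second):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(first)
    p_observed = percent_agreement(first, second)
    counts_first, counts_second = Counter(first), Counter(second)
    # Chance agreement: product of the marginal proportions, summed over categories.
    categories = set(counts_first) | set(counts_second)
    p_chance = sum(counts_first[c] * counts_second[c] for c in categories) / n**2
    if p_chance == 1.0:
        # Both assessments used one identical category throughout; kappa is undefined.
        return math.nan
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings for one STARD item across ten studies (invented data).
first_assessment = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
second_assessment = [1, 1, 0, 0, 1, 0, 1, 1, 1, 1]

agreement = percent_agreement(first_assessment, second_assessment)
kappa = cohens_kappa(first_assessment, second_assessment)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
# prints "agreement = 0.80, kappa = 0.52"
```

In this invented example the raw agreement is 80% but kappa is only about 0.52, because most ratings fall in one category and a good share of the agreement is expected by chance. This is why an item can show high percentage agreement alongside a low kappa.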

Results

The overall inter-assessment agreement for all items of the STARD statement was 85% (Cohen's kappa 0.70) and varied from 63% to 100% for individual items. The largest differences between the two assessments were found for the reporting of the rationale of the reference standard (kappa 0.37), the number of included participants who underwent tests (kappa 0.28), the distribution of the severity of the disease (kappa 0.23), a cross tabulation of the results of the index test by the results of the reference standard (kappa 0.33) and how indeterminate results, missing data and outliers were handled (kappa 0.25). Large differences were also observed for these items within and between reviewers. The inter-assessment reliability of the STARD checklist was satisfactory (ICC = 0.79 [95% CI: 0.62 to 0.89]).

Conclusion

Although the overall reproducibility of the quality of reporting on diagnostic accuracy studies using the STARD statement was found to be good, substantial disagreements were found for specific items. These disagreements were not so much caused by differences in interpretation of the items by the reviewers but rather by difficulties in assessing the reporting of these items due to lack of clarity within the articles. Including a flow diagram in all reports on diagnostic accuracy studies would be very helpful in reducing confusion between readers and among reviewers.

SUBMITTER: Smidt N

PROVIDER: S-EPMC1522016 | biostudies-literature | 2006 Mar

REPOSITORIES: biostudies-literature

Publications

Reproducibility of the STARD checklist: an instrument to assess the quality of reporting of diagnostic accuracy studies.

Nynke Smidt, Anne W S Rutjes, Daniëlle A W M van der Windt, Raymond W J G Ostelo, Patrick M Bossuyt, Johannes B Reitsma, Lex M Bouter, Henrica C W de Vet

BMC Medical Research Methodology, 2006-03-15

PMID: 16539705

Similar Datasets

Project description:

Objectives: To investigate whether encouraging authors to follow the Standards for Reporting Diagnostic Accuracy (STARD) guidelines improves the quality of reporting of diagnostic accuracy studies.

Methods: In mid-2017, European Radiology started encouraging its authors to follow the STARD guidelines. Our MEDLINE search identified 114 diagnostic accuracy studies published in European Radiology in 2015 and 2019. The quality of reporting was evaluated by two independent reviewers using the revised STARD statement. Item 11 was excluded because a meaningful decision about adherence was not possible. Student's t test for independent samples was used to analyze differences in the mean number of reported STARD items between studies published in 2015 and in 2019. In addition, we calculated differences related to the study design, data collection, and citation rate.

Results: The mean total number of reported STARD items for all 114 diagnostic accuracy studies analyzed was 15.9 ± 2.6 (54.8%) of 29 items (range 9.5-22.5). The quality of reporting of diagnostic accuracy studies was significantly better in 2019 (mean ± standard deviation (SD), 16.3 ± 2.7) than in 2015 (mean ± SD, 15.1 ± 2.3; p < 0.02). No significant differences in the reported STARD items were identified in relation to study design (p = 0.13), data collection (p = 0.87), and citation rate (p = 0.09).

Conclusion: The quality of reporting of diagnostic accuracy studies according to the STARD statement was moderate, with a slight improvement since European Radiology started recommending that its authors follow the STARD guidelines.

Key points:
• The quality of reporting of diagnostic accuracy studies was moderate with a mean total number of reported STARD items of 15.9 ± 2.6.
• The adherence to STARD was significantly better in 2019 than in 2015 (16.3 ± 2.7 vs. 15.1 ± 2.3; p = 0.016).
• No significant differences in the reported STARD items were identified in relation to study design (p = 0.13), data collection (p = 0.87), and citation rate (p = 0.09).

Project description:

Background: Research has shown a modest adherence of diagnostic test accuracy (DTA) studies in glaucoma to the Standards for Reporting of Diagnostic Accuracy Studies (STARD). We have applied the updated 30-item STARD 2015 checklist to a set of studies included in a Cochrane DTA systematic review of imaging tools for diagnosing manifest glaucoma.

Methods: Three pairs of reviewers, including one senior reviewer who assessed all studies, independently checked the adherence of each study to STARD 2015. Adherence was analyzed on an individual-item basis. Logistic regression was used to evaluate the effect of publication year and impact factor on adherence.

Results: We included 106 DTA studies, published between 2003 and 2014 in journals with a median impact factor of 2.6. Overall adherence was 54.1% for 3,286 individual ratings across 31 items, with a mean of 16.8 (SD: 3.1; range 8-23) items per study. Large variability in adherence to reporting standards was detected across individual STARD 2015 items, ranging from 0 to 100%. Nine items (1: identification as diagnostic accuracy study in title/abstract; 6: eligibility criteria; 10: index test (a) and reference standard (b) definition; 12: cut-off definitions for index test (a) and reference standard (b); 14: estimation of diagnostic accuracy measures; 21a: severity spectrum of diseased; 23: cross-tabulation of the index and reference standard results) were adequately reported in more than 90% of the studies. Conversely, 10 items (3: scientific and clinical background of the index test; 11: rationale for the reference standard; 13b: blinding of index test results; 17: analyses of variability; 18: sample size calculation; 19: study flow diagram; 20: baseline characteristics of participants; 28: registration number and registry; 29: availability of study protocol; 30: sources of funding) were adequately reported in less than 30% of the studies. Only four items showed a statistically significant improvement over time: missing data (16), baseline characteristics of participants (20), estimates of diagnostic accuracy (24) and sources of funding (30).

Conclusions: Adherence to STARD 2015 among DTA studies in glaucoma research is incomplete, and only modestly increasing over time.

Project description:

Introduction: Standards for Reporting of Diagnostic Accuracy Study (STARD) was developed to improve the completeness and transparency of reporting in studies investigating diagnostic test accuracy. However, its current form, STARD 2015, does not address the issues and challenges raised by artificial intelligence (AI)-centred interventions. As such, we propose an AI-specific version of the STARD checklist (STARD-AI), which focuses on the reporting of AI diagnostic test accuracy studies. This paper describes the methods that will be used to develop STARD-AI.

Methods and analysis: The development of the STARD-AI checklist can be distilled into six stages. (1) A project organisation phase has been undertaken, during which a Project Team and a Steering Committee were established; (2) An item generation process has been completed following a literature review, a patient and public involvement and engagement exercise and an online scoping survey of international experts; (3) A three-round modified Delphi consensus methodology is underway, which will culminate in a teleconference consensus meeting of experts; (4) Thereafter, the Project Team will draft the initial STARD-AI checklist and the accompanying documents; (5) A piloting phase among expert users will be undertaken to identify items which are either unclear or missing. This process, consisting of surveys and semistructured interviews, will contribute towards the explanation and elaboration document; and (6) On finalisation of the manuscripts, the group's efforts turn towards an organised dissemination and implementation strategy to maximise end-user adoption.

Ethics and dissemination: Ethical approval has been granted by the Joint Research Compliance Office at Imperial College London (reference number: 19IC5679). A dissemination strategy will be aimed towards five groups of stakeholders: (1) academia, (2) policy, (3) guidelines and regulation, (4) industry and (5) public and non-specific stakeholders. We anticipate that dissemination will take place in Q3 of 2021.
