Dataset Information

Diagnostic mammography: identifying minimally acceptable interpretive performance criteria.

ABSTRACT: To develop criteria to identify thresholds for the minimally acceptable performance of physicians interpreting diagnostic mammography studies. In an institutional review board-approved HIPAA-compliant study, an Angoff approach was used to set criteria for identifying minimally acceptable interpretive performance for both workup after abnormal screening examinations and workup of a breast lump. Normative data from the Breast Cancer Surveillance Consortium (BCSC) was used to help the expert radiologist identify the impact of cut points. Simulations, also using data from the BCSC, were used to estimate the expected clinical impact from the recommended performance thresholds. Final cut points for workup of abnormal screening examinations were as follows: sensitivity, less than 80%; specificity, less than 80% or greater than 95%; abnormal interpretation rate, less than 8% or greater than 25%; positive predictive value (PPV) of biopsy recommendation (PPV2), less than 15% or greater than 40%; PPV of biopsy performed (PPV3), less than 20% or greater than 45%; and cancer diagnosis rate, less than 20 per 1000 interpretations. Final cut points for workup of a breast lump were as follows: sensitivity, less than 85%; specificity, less than 83% or greater than 95%; abnormal interpretation rate, less than 10% or greater than 25%; PPV2, less than 25% or greater than 50%; PPV3, less than 30% or greater than 55%; and cancer diagnosis rate, less than 40 per 1000 interpretations.
If underperforming physicians moved into the acceptable range after remedial training, the expected result would be (a) diagnosis of an additional 86 cancers per 100,000 women undergoing workup after screening examinations, with a reduction in the number of false-positive examinations by 1067 per 100,000 women undergoing this workup, and (b) diagnosis of an additional 335 cancers per 100,000 women undergoing workup of a breast lump, with a reduction in the number of false-positive examinations by 634 per 100,000 women undergoing this workup. Interpreting physicians who fall outside one or more of the identified cut points should be reviewed in the context of an overall assessment of all their performance measures and their specific practice setting to determine if remedial training is indicated.
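
The cut points listed in the abstract can be applied mechanically to one physician's measures. The following is a minimal Python sketch of that check, not the authors' method: the dictionary layout, the measure names, and the function flag_measures are all hypothetical, and only the numeric thresholds come from the abstract. Percent-valued measures are assumed to be expressed on a 0-100 scale.

```python
# Cut points copied from the abstract. A value below `low` or above `high`
# is flagged; `None` means the abstract states no bound on that side.
# The key names are illustrative assumptions, not identifiers from the study.
CUT_POINTS = {
    "abnormal_screening": {
        "sensitivity": (80.0, None),
        "specificity": (80.0, 95.0),
        "abnormal_interpretation_rate": (8.0, 25.0),
        "ppv2": (15.0, 40.0),
        "ppv3": (20.0, 45.0),
        "cancer_diagnosis_rate_per_1000": (20.0, None),
    },
    "breast_lump": {
        "sensitivity": (85.0, None),
        "specificity": (83.0, 95.0),
        "abnormal_interpretation_rate": (10.0, 25.0),
        "ppv2": (25.0, 50.0),
        "ppv3": (30.0, 55.0),
        "cancer_diagnosis_rate_per_1000": (40.0, None),
    },
}


def flag_measures(indication, measures):
    """Return (measure, 'below' | 'above') pairs that fall outside the cut points.

    `indication` is "abnormal_screening" or "breast_lump"; `measures` maps
    measure names to values (percentages, or a rate per 1000 interpretations).
    """
    flagged = []
    for name, value in measures.items():
        low, high = CUT_POINTS[indication][name]
        if value < low:
            flagged.append((name, "below"))
        elif high is not None and value > high:
            flagged.append((name, "above"))
    return flagged


if __name__ == "__main__":
    # Hypothetical physician, workup of a breast lump.
    example = {"sensitivity": 84.0, "specificity": 96.0, "ppv2": 30.0}
    print(flag_measures("breast_lump", example))
```

As the abstract notes, a flag from a check like this is only a prompt for review of the full set of measures and the practice setting. It is not by itself a determination that remedial training is needed.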

SUBMITTER: Carney PA

PROVIDER: S-EPMC3632803 | biostudies-other | 2013 May

REPOSITORIES: biostudies-other

Similar Datasets

Project description: Purpose: To examine whether U.S. radiologists' interpretive volume affects their screening mammography performance. Materials and methods: Annual interpretive volume measures (total, screening, diagnostic, and screening focus [ratio of screening to diagnostic mammograms]) were collected for 120 radiologists in the Breast Cancer Surveillance Consortium (BCSC) who interpreted 783 965 screening mammograms from 2002 to 2006. Volume measures in 1 year were examined by using multivariate logistic regression relative to screening sensitivity, false-positive rates, and cancer detection rate the next year. BCSC registries and the Statistical Coordinating Center received institutional review board approval for active or passive consenting processes and a Federal Certificate of Confidentiality and other protections for participating women, physicians, and facilities. All procedures were compliant with the terms of the Health Insurance Portability and Accountability Act. Results: Mean sensitivity was 85.2% (95% confidence interval [CI]: 83.7%, 86.6%) and was significantly lower for radiologists with a greater screening focus (P = .023) but did not significantly differ by total (P = .47), screening (P = .33), or diagnostic (P = .23) volume. The mean false-positive rate was 9.1% (95% CI: 8.1%, 10.1%), with rates significantly higher for radiologists who had the lowest total (P = .008) and screening (P = .015) volumes. Radiologists with low diagnostic volume (P = .004 and P = .008) and a greater screening focus (P = .003 and P = .002) had significantly lower false-positive and cancer detection rates, respectively. Median invasive tumor size and proportion of cancers detected at early stages did not vary by volume. Conclusion: Increasing minimum interpretive volume requirements in the United States while adding a minimal requirement for diagnostic interpretation could reduce the number of false-positive work-ups without hindering cancer detection.
These results provide detailed associations between mammography volumes and performance for policymakers to consider along with workforce, practice organization, and access issues and radiologist experience when reevaluating requirements.

Project description: Purpose: To examine radiologists' screening performance in relation to the number of diagnostic work-ups performed after abnormal findings are discovered at screening mammography by the same radiologist or by different radiologists. Materials and methods: In an institutional review board-approved HIPAA-compliant study, the authors linked 651 671 screening mammograms interpreted from 2002 to 2006 by 96 radiologists in the Breast Cancer Surveillance Consortium to cancer registries (standard of reference) to evaluate the performance of screening mammography (sensitivity, false-positive rate [FPR], and cancer detection rate [CDR]). Logistic regression was used to assess the association between the volume of recalled screening mammograms ("own" mammograms, where the radiologist who interpreted the diagnostic image was the same radiologist who had interpreted the screening image, and "any" mammograms, where the radiologist who interpreted the diagnostic image may or may not have been the radiologist who interpreted the screening image) and screening performance and whether the association between total annual volume and performance differed according to the volume of diagnostic work-up. Results: Annually, 38% of radiologists performed the diagnostic work-up for 25 or fewer of their own recalled screening mammograms, 24% performed the work-up for 0-50, and 39% performed the work-up for more than 50. For the work-up of recalled screening mammograms from any radiologist, 24% of radiologists performed the work-up for 0-50 mammograms, 32% performed the work-up for 51-125, and 44% performed the work-up for more than 125.
With increasing numbers of radiologist work-ups for their own recalled mammograms, the sensitivity (P = .039), FPR (P = .004), and CDR (P < .001) of screening mammography increased, yielding a stepped increase in women recalled per cancer detected from 17.4 for 25 or fewer mammograms to 24.6 for more than 50 mammograms. Increases in work-ups for any radiologist yielded significant increases in FPR (P = .011) and CDR (P = .001) and a nonsignificant increase in sensitivity (P = .15). Radiologists with a lower annual volume of any work-ups had consistently lower FPR, sensitivity, and CDR at all annual interpretive volumes. Conclusion: These findings support the hypothesis that radiologists may improve their screening performance by performing the diagnostic work-up for their own recalled screening mammograms and directly receiving feedback afforded by means of the outcomes associated with their initial decision to recall. Arranging for radiologists to work up a minimum number of their own recalled cases could improve screening performance but would need systems to facilitate this workflow.

Project description: Purpose: To establish contemporary performance benchmarks for diagnostic digital mammography with use of recent data from the Breast Cancer Surveillance Consortium (BCSC). Materials and Methods: Institutional review board approval was obtained for active or passive consenting processes or to obtain a waiver of consent to enroll participants, link data, and perform analyses. Data were obtained from six BCSC registries (418 radiologists, 92 radiology facilities). Mammogram indication and assessments were prospectively collected for women undergoing diagnostic digital mammography and linked with cancer diagnoses from state cancer registries. The study included 401 548 examinations conducted from 2007 to 2013 in 265 360 women. Results: Overall diagnostic performance measures were as follows: cancer detection rate, 34.7 per 1000 (95% confidence interval [CI]: 34.1, 35.2); abnormal interpretation rate, 12.6% (95% CI: 12.5%, 12.7%); positive predictive value (PPV) of a biopsy recommendation (PPV2), 27.5% (95% CI: 27.1%, 27.9%); PPV of biopsies performed (PPV3), 30.4% (95% CI: 29.9%, 30.9%); false-negative rate, 4.8 per 1000 (95% CI: 4.6, 5.0); sensitivity, 87.8% (95% CI: 87.3%, 88.4%); and specificity, 90.5% (95% CI: 90.4%, 90.6%). Among cancers detected, 63.4% were stage 0 or 1 cancers, 45.6% were minimal cancers, the mean size of invasive cancers was 21.2 mm, and 69.6% of invasive cancers were node negative. Performance metrics varied widely across diagnostic indications, with cancer detection rate (64.5 per 1000) and abnormal interpretation rate (18.7%) highest for diagnostic mammograms obtained to evaluate a breast problem with a lump. Compared with performance during the screen-film mammography era, diagnostic digital performance showed increased abnormal interpretation and cancer detection rates and decreasing PPVs, with less than 70% of radiologists within acceptable ranges for PPV2 and PPV3.
Conclusion: These performance measures can serve as national benchmarks that may help transform the marked variation in radiologists' diagnostic performance into targeted quality improvement efforts. © RSNA, 2017. Online supplemental material is available for this article.
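
For readers unfamiliar with how the measures named in this abstract relate to one another, here is a hedged Python sketch that derives them from a 2x2 table of interpretations against cancer outcomes. It uses the conventional definitions of sensitivity, specificity, and predictive value. The function name performance_measures, the counts in the example, and the assumption that every positive interpretation carries a biopsy recommendation (so that PPV2 equals TP / (TP + FP)) are illustrative and are not taken from the study. PPV3 is omitted because it needs counts of biopsies actually performed.

```python
def performance_measures(tp, fp, fn, tn):
    """Compute diagnostic performance measures from 2x2 counts.

    tp / fp: positive interpretations with / without a cancer diagnosis
    fn / tn: negative interpretations with / without a cancer diagnosis
    Assumption (not from the source): a positive interpretation always
    carries a biopsy recommendation, so PPV2 = tp / (tp + fp).
    """
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "abnormal_interpretation_rate": (tp + fp) / n,
        "cancer_detection_rate_per_1000": 1000.0 * tp / n,
        "false_negative_rate_per_1000": 1000.0 * fn / n,
        "ppv2": tp / (tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts for 1000 diagnostic examinations.
    for name, value in performance_measures(tp=35, fp=91, fn=5, tn=869).items():
        print(f"{name}: {value:.3f}")
```

The counts above are invented for illustration and are not intended to reproduce the benchmark values reported in the abstract.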

Project description:The objective of this study was to determine whether multi-microRNA analysis using a combination of four microRNA biomarkers (miR-1246, 202, 21, and 219B) could improve the diagnostic performance of mammography in determining breast cancer risk by age group (under 50 vs. over 50) and distinguish breast cancer from benign breast diseases and other cancers (thyroid, colon, stomach, lung, liver, and cervix cancers). To verify breast cancer classification performance of the four miRNA biomarkers and whether the model providing breast cancer risk score could distinguish between benign breast disease and other cancers, the model was verified using nonlinear support vector machine (SVM) and generalized linear model (GLM) and age and four miRNA qRT-PCR analysis values (dCt) were input to these models. Breast cancer risk scores for each Breast Imaging-Reporting and Data System (BI-RADS) category in multi-microRNA analysis were analyzed to examine the correlation between breast cancer risk scores and mammography categories. We generated two models using two classification algorithms, SVM and GLM, with a combination of four miRNA biomarkers showing high performance and sensitivities of 84.5% and 82.1%, a specificity of 85%, and areas under the curve (AUCs) of 0.967 and 0.965, respectively, which showed consistent performance across all stages of breast cancer and patient ages. The results of this study showed that this multi-microRNA analysis using the four miRNA biomarkers was effective in classifying breast cancer in patients under the age of 50, which is challenging to accurately diagnose. In addition, breast cancer and benign breast diseases can be classified, showing the possibility of helping with diagnosis by mammography. Verification of the performance of the four miRNA biomarkers confirmed that multi-microRNA analysis could be used as a new breast cancer screening aid to improve the accuracy of mammography. 
However, many factors must be considered for clinical use. Further validation with an appropriate screening population in large clinical trials is required. This trial is registered with (KNUCH 2022-04-036).
