Dataset Information

A curated mammography data set for use in computer-aided detection and diagnosis research.

ABSTRACT: Published research results are difficult to replicate due to the lack of a standard evaluation data set in the area of decision support systems in mammography; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. This causes an inability to directly compare the performance of methods or to replicate prior results. We seek to resolve this substantial challenge by releasing an updated and standardized version of the Digital Database for Screening Mammography (DDSM) for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography. Our data set, the CBIS-DDSM (Curated Breast Imaging Subset of DDSM), includes decompressed images, data selection and curation by trained mammographers, updated mass segmentation and bounding boxes, and pathologic diagnosis for training data, formatted similarly to modern computer vision data sets. The data set contains 753 calcification cases and 891 mass cases, providing a data-set size capable of analyzing decision support systems in mammography.

SUBMITTER: Lee RS

PROVIDER: S-EPMC5735920 | biostudies-literature | 2017 Dec

REPOSITORIES: biostudies-literature


Publications

A curated mammography data set for use in computer-aided detection and diagnosis research.

Rebecca Sawyer Lee, Francisco Gimenez, Assaf Hoogi, Kanae Kawai Miyake, Mia Gorovoy, Daniel L. Rubin

Scientific Data, 2017-12-19


PMID: 29257132

Similar Datasets

Project description: Background: Computer-aided detection identifies suspicious findings on mammograms to assist radiologists. Since the Food and Drug Administration approved the technology in 1998, it has been disseminated into practice, but its effect on the accuracy of interpretation is unclear. Methods: We determined the association between the use of computer-aided detection at mammography facilities and the performance of screening mammography from 1998 through 2002 at 43 facilities in three states. We had complete data for 222,135 women (a total of 429,345 mammograms), including 2351 women who received a diagnosis of breast cancer within 1 year after screening. We calculated the specificity, sensitivity, and positive predictive value of screening mammography with and without computer-aided detection, as well as the rates of biopsy and breast-cancer detection and the overall accuracy, measured as the area under the receiver-operating-characteristic (ROC) curve. Results: Seven facilities (16%) implemented computer-aided detection during the study period. Diagnostic specificity decreased from 90.2% before implementation to 87.2% after implementation (P<0.001), the positive predictive value decreased from 4.1% to 3.2% (P=0.01), and the rate of biopsy increased by 19.7% (P<0.001). The increase in sensitivity from 80.4% before implementation of computer-aided detection to 84.0% after implementation was not significant (P=0.32). The change in the cancer-detection rate (including invasive breast cancers and ductal carcinomas in situ) was not significant (4.15 cases per 1000 screening mammograms before implementation and 4.20 cases after implementation, P=0.90). Analyses of data from all 43 facilities showed that the use of computer-aided detection was associated with significantly lower overall accuracy than was nonuse (area under the ROC curve, 0.871 vs. 0.919; P=0.005). Conclusions: The use of computer-aided detection is associated with reduced accuracy of interpretation of screening mammograms. The increased rate of biopsy with the use of computer-aided detection is not clearly associated with improved detection of invasive breast cancer.
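For readers less familiar with the screening metrics named in this abstract, the following is a minimal sketch of how sensitivity, specificity, and positive predictive value are derived from confusion-matrix counts. The function name and the counts are hypothetical illustrations, not data from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, ppv) from confusion-matrix counts.

    tp/fn: cancers with a positive/negative screening result.
    fp/tn: non-cancers with a positive/negative screening result.
    """
    sensitivity = tp / (tp + fn)  # fraction of cancers that screened positive
    specificity = tn / (tn + fp)  # fraction of non-cancers that screened negative
    ppv = tp / (tp + fp)          # fraction of positive screens that were cancers
    return sensitivity, specificity, ppv


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, ppv = screening_metrics(tp=80, fp=100, tn=900, fn=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} ppv={ppv:.3f}")
```

With low disease prevalence, as in screening populations, even a small drop in specificity adds many false positives, which is why the positive predictive value can fall while sensitivity stays roughly constant.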

Project description: Objective: To test the performance of an artificial intelligence-based computer-aided diagnosis (AI-CAD) designed for full-field digital mammography (FFDM) when applied to synthetic mammography (SM). Materials and methods: We analyzed 501 women (mean age, 57 ± 11 years) who underwent preoperative mammography and breast cancer surgery. This cohort consisted of 1002 breasts, comprising 517 with cancer and 485 without. All patients underwent digital breast tomosynthesis (DBT) and FFDM during the preoperative workup. The SM is routinely reconstructed using DBT. Commercial AI-CAD (Lunit Insight MMG, version 1.1.7.2) was retrospectively applied to SM and FFDM to calculate the abnormality scores for each breast. The median abnormality scores were compared for the 517 breasts with cancer using the Wilcoxon signed-rank test. Calibration curves of abnormality scores were evaluated. The discrimination performance was analyzed using the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity using a 10% preset threshold. Sensitivity and specificity were further analyzed according to the mammographic and pathological characteristics. The results of SM and FFDM were compared. Results: AI-CAD demonstrated a significantly lower median abnormality score (71% vs. 96%, P < 0.001) and poorer calibration performance for SM than for FFDM. SM exhibited lower sensitivity (76.2% vs. 82.8%, P < 0.001), higher specificity (95.5% vs. 91.8%, P < 0.001), and comparable AUC (0.86 vs. 0.87, P = 0.127) than FFDM. SM showed lower sensitivity than FFDM in asymptomatic breasts, dense breasts, ductal carcinoma in situ, T1, N0, and hormone receptor-positive/human epidermal growth factor receptor 2-negative cancers but showed higher specificity in non-cancerous dense breasts. Conclusion: AI-CAD showed lower abnormality scores and reduced calibration performance for SM than for FFDM. Furthermore, the 10% preset threshold resulted in different discrimination performances for the SM. Given these limitations, off-label application of the current AI-CAD to SM should be avoided.

Project description: Importance: After the US Food and Drug Administration (FDA) approved computer-aided detection (CAD) for mammography in 1998, and the Centers for Medicare and Medicaid Services (CMS) provided increased payment in 2002, CAD technology disseminated rapidly. Despite sparse evidence that CAD improves accuracy of mammographic interpretations and costs over $400 million a year, CAD is currently used for most screening mammograms in the United States. Objective: To measure performance of digital screening mammography with and without CAD in US community practice. Design, setting, and participants: We compared the accuracy of digital screening mammography interpreted with (n = 495 818) vs without (n = 129 807) CAD from 2003 through 2009 in 323 973 women. Mammograms were interpreted by 271 radiologists from 66 facilities in the Breast Cancer Surveillance Consortium. Linkage with tumor registries identified 3159 breast cancers in 323 973 women within 1 year of the screening. Main outcomes and measures: Mammography performance (sensitivity, specificity, and screen-detected and interval cancers per 1000 women) was modeled using logistic regression with radiologist-specific random effects to account for correlation among examinations interpreted by the same radiologist, adjusting for patient age, race/ethnicity, time since prior mammogram, examination year, and registry. Conditional logistic regression was used to compare performance among 107 radiologists who interpreted mammograms both with and without CAD. Results: Screening performance was not improved with CAD on any metric assessed. Mammography sensitivity was 85.3% (95% CI, 83.6%-86.9%) with and 87.3% (95% CI, 84.5%-89.7%) without CAD. Specificity was 91.6% (95% CI, 91.0%-92.2%) with and 91.4% (95% CI, 90.6%-92.0%) without CAD. There was no difference in cancer detection rate (4.1 in 1000 women screened with and without CAD). Computer-aided detection did not improve intraradiologist performance. Sensitivity was significantly decreased for mammograms interpreted with vs without CAD in the subset of radiologists who interpreted both with and without CAD (odds ratio, 0.53; 95% CI, 0.29-0.97). Conclusions and relevance: Computer-aided detection does not improve diagnostic accuracy of mammography. These results suggest that insurers pay more for CAD with no established benefit to women.

Project description: To improve the performance of a computer-aided detection (CAD) system for mass detection by using four-view information in screening mammography. The authors developed a four-view CAD system that emulates radiologists' reading by using the craniocaudal and mediolateral oblique views of the ipsilateral breast to reduce false positives (FPs) and the corresponding views of the contralateral breast to detect asymmetry. The CAD system consists of four major components: (1) Initial detection of breast masses on individual views, (2) information fusion of the ipsilateral views of the breast (referred to as two-view analysis), (3) information fusion of the corresponding views of the contralateral breast (referred to as bilateral analysis), and (4) fusion of the four-view information with a decision tree. The authors collected two data sets for training and testing of the CAD system: A mass set containing 389 patients with 389 biopsy-proven masses and a normal set containing 200 normal subjects. All cases had four-view mammograms. The true locations of the masses on the mammograms were identified by an experienced MQSA radiologist. The authors randomly divided the mass set into two independent sets for cross validation training and testing. The overall test performance was assessed by averaging the free response receiver operating characteristic (FROC) curves of the two test subsets. The FP rates during the FROC analysis were estimated by using the normal set only. The jackknife free-response ROC (JAFROC) method was used to estimate the statistical significance of the difference between the test FROC curves obtained with the single-view and the four-view CAD systems. Using the single-view CAD system, the breast-based test sensitivities were 58% and 77% at the FP rates of 0.5 and 1.0 per image, respectively. With the four-view CAD system, the breast-based test sensitivities were improved to 76% and 87% at the corresponding FP rates, respectively. The improvement was found to be statistically significant (p < 0.0001) by JAFROC analysis. The four-view information fusion approach that emulates radiologists' reading strategy significantly improves the performance of breast mass detection of the CAD system in comparison with the single-view approach.

Project description: Purpose: To compare machine learning methods for classifying mass lesions on mammography images that use predefined image features computed over lesion segmentations to those that leverage segmentation-free representation learning on a standard, public evaluation dataset. Methods: We apply several classification algorithms to the public Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM), in which each image contains a mass lesion. Segmentation-free representation learning techniques for classifying lesions as benign or malignant include both a Bag-of-Visual-Words (BoVW) method and a Convolutional Neural Network (CNN). We compare classification performance of these techniques to that obtained using two different segmentation-dependent approaches from the literature that rely on specific combinations of end classifiers (e.g. linear discriminant analysis, neural networks) and predefined features computed over the lesion segmentation (e.g. spiculation measure, morphological characteristics, intensity metrics). Results: We report area under the receiver operating characteristic curve (AZ) values for malignancy classification on CBIS-DDSM for each technique. We find average AZ values of 0.73 for a segmentation-free BoVW method, 0.86 for a segmentation-free CNN method, 0.75 for a segmentation-dependent linear discriminant analysis of Rubber-Band Straightening Transform features, and 0.58 for a hybrid rule-based neural network classification using a small number of hand-designed features. Conclusions: We find that malignancy classification performance on the CBIS-DDSM dataset using segmentation-free BoVW features is comparable to that of the best segmentation-dependent methods we study, but also observe that a common segmentation-free CNN model substantially and significantly outperforms each of these (p < 0.05). These results reinforce recent findings suggesting that representation learning techniques such as BoVW and CNNs are advantageous for mammogram analysis because they do not require lesion segmentation, the quality and specific characteristics of which can vary substantially across datasets. We further observe that segmentation-dependent methods achieve performance levels on CBIS-DDSM inferior to those achieved on the original evaluation datasets reported in the literature. Each of these findings reinforces the need for standardization of datasets, segmentation techniques, and model implementations in performance assessments of automated classifiers for medical imaging.
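The area under the ROC curve reported in this abstract can be read as the probability that a randomly chosen malignant case receives a higher classifier score than a randomly chosen benign case. The sketch below computes it that way by direct pairwise comparison; the function name and the scores are hypothetical and do not come from the study, whose exact estimation procedure is not described here.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Estimate ROC AUC as the fraction of (positive, negative) pairs
    in which the positive case scores higher; ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier outputs for malignant and benign lesions.
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
auc = auc_from_scores(malignant, benign)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a data set of this size; a rank-based formulation gives the same value more efficiently on larger sets.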

Project description: Radiologists' diagnostic capabilities for breast mass lesions depend on their experience. Junior radiologists may underestimate or overestimate Breast Imaging Reporting and Data System (BI-RADS) categories of mass lesions owing to a lack of diagnostic experience. The computer-aided diagnosis (CAD) method assists in improving diagnostic performance by providing a breast mass classification reference to radiologists. This study aims to evaluate the impact of a CAD method based on perceptive features learned from quantitative BI-RADS descriptions on breast mass diagnosis performance. We conducted a retrospective multi-reader multi-case (MRMC) study to assess the perceptive feature-based CAD method. A total of 416 digital mammograms of patients with breast masses were obtained from 2014 through 2017, including 231 benign and 185 malignant masses, from which we randomly selected 214 cases (109 benign, 105 malignant) to train the CAD model for perceptive feature extraction and classification. The remaining 202 cases were enrolled as the test set for evaluation, of which 51 patients (29 benign and 22 malignant) participated in the MRMC study. In the MRMC study, we categorized six radiologists into three groups: junior, middle-senior, and senior. They diagnosed 51 patients with and without support from the CAD model. The BI-RADS category, benign or malignant diagnosis, malignancy probability, and diagnosis time during the two evaluation sessions were recorded. In the MRMC evaluation, the average area under the curve (AUC) of the six radiologists with CAD support was slightly higher than that without support (0.896 vs. 0.850, p = 0.0209). Both average sensitivity and specificity increased (p = 0.0253). Under CAD assistance, junior and middle-senior radiologists adjusted the assessment categories of more BI-RADS 4 cases. The diagnosis time with and without CAD support was comparable for five radiologists. The CAD model improved the radiologists' diagnostic performance for breast masses without prolonging the diagnosis time and assisted in a better BI-RADS assessment, especially for junior radiologists.
