Dataset Information

Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers.

ABSTRACT:

BACKGROUND: Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, the labels of these datasets are extracted by classical natural language processing, which may introduce some label errors.

OBJECTIVE: This study aimed to investigate the robustness of deep convolutional neural networks (CNNs) for binary classification of posteroanterior chest x-rays under random incorrect labeling.

METHODS: We trained and validated the CNN architecture with different levels of label noise in 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models with each test set. Diseases of each chest x-ray in our dataset were confirmed by a thoracic radiologist using computed tomography (CT). Receiver operating characteristic (ROC) and area under the curve (AUC) were evaluated in each test. Randomly chosen chest x-rays of public datasets were evaluated by 3 physicians and 1 thoracic radiologist.

RESULTS: In the public NIH and CheXpert datasets, AUCs did not drop significantly with up to 16% label noise, whereas the AUC of the AMC-SNUBH dataset decreased significantly from 2% label noise. Evaluation of the public datasets by 3 physicians and 1 thoracic radiologist showed an accuracy of 65%-80%.

CONCLUSIONS: The deep learning-based computer-aided diagnosis model is sensitive to label noise, and computer-aided diagnosis with inaccurate labels is not credible. Furthermore, open datasets such as NIH and CheXpert need to be distilled before being used for deep learning-based computer-aided diagnosis.
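
The label-noise experiment described in the methods can be illustrated with a minimal sketch. It is not the authors' code: the function names `flip_labels` and `auc`, the choice to flip an exact fraction of labels, and the toy label list are all assumptions made for illustration. The 2% and 16% rates are simply the two noise levels mentioned in the results.

```python
import random

def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of binary labels with a fixed fraction flipped at random.

    Assumption: "random incorrect labeling" is modeled as flipping
    round(noise_rate * n) labels chosen uniformly without replacement.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 100 balanced labels, corrupted at increasing noise rates.
clean = [0, 1] * 50
for rate in (0.0, 0.02, 0.16):
    noisy = flip_labels(clean, rate, seed=42)
    changed = sum(a != b for a, b in zip(clean, noisy))
    print(f"noise={rate:.0%}: {changed} of {len(clean)} labels flipped")
```

In an experiment of this kind, the noisy labels would be used for training and validation only, while `auc` would be computed against the clean test labels.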

SUBMITTER: Jang R

PROVIDER: S-EPMC7435602 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature


Publications

Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers.

Ryoungwoo Jang, Namkug Kim, Miso Jang, Kyung Hwa Lee, Sang Min Lee, Kyung Hee Lee, Han Na Noh, Joon Beom Seo

JMIR Medical Informatics, 2020 Aug 4; 8


PMID: 32749222

Similar Datasets

Project description: In this paper, we propose a new Modified Laplacian Vector Median Filter (MLVMF) for real-time denoising of complex images corrupted by "salt and pepper" impulsive noise. The method consists of two rounds with three steps each: the first round starts with the identification of pixels that may be contaminated by noise using a Modified Laplacian Filter. Then, corrupted pixels pass a neighborhood-based validation test. Finally, the Vector Median Filter is used to replace noisy pixels. The MLVMF uses a 5 × 5 window to observe the intensity variations around each pixel of the image with a rotation step of π/8, while the classic Laplacian filters often use rotation steps of π/2 or π/4. This rotation step refinement gives better identification of noise-corrupted pixels. Despite this advantage, a high percentage of impulsive noise may cause two or more corrupted pixels (with the same intensity) to collide, preventing the identification of noise-corrupted pixels. A second round is then necessary using a second set of filters, still based on the Laplacian operator, but focusing only on the collision phenomenon. To validate our method, MLVMF is first tested on standard images, with a noise percentage varying from 3% to 30%. The performance obtained in terms of processing time, as well as image restoration quality through the PSNR (Peak Signal to Noise Ratio) and the NCD (Normalized Color Difference) metrics, is compared with the performance of VMF (Vector Median Filter), VMRHF (Vector Median-Rational Hybrid Filter), and MSMF (Modified Switching Median Filter). A second test is performed on several noisy chest x-ray images used in cardiovascular disease diagnosis as well as COVID-19 diagnosis. The proposed method shows a very good quality of restoration on this type of image, particularly when the percentage of noise is high. The MLVMF provides a high PSNR value of 5.5% and a low NCD value of 18.2%.
Finally, an optimized Field-Programmable Gate Array (FPGA) design is proposed to implement the proposed method for real-time processing. The proposed hardware implementation allows an execution time equal to 9 ms per 256 × 256 color image.
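
For orientation, the plain Vector Median Filter (VMF) that the description uses as a replacement step and as a baseline can be sketched as follows, together with the PSNR metric. This is only an illustrative sketch under stated assumptions: it filters every pixel with a 3 × 3 window and does not implement the Modified Laplacian detection, the validation test, or the second round of MLVMF. The function names and the toy 5 × 5 image are hypothetical.

```python
import math

def vector_median(vectors):
    """Return the vector with the smallest summed Euclidean distance to the others."""
    return min(vectors, key=lambda v: sum(math.dist(v, w) for w in vectors))

def vmf(image, radius=1):
    """Apply a plain vector median filter to an image given as rows of RGB tuples.

    Assumption: windows are clipped at the image border rather than padded.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(vector_median(window))
        out.append(row)
    return out

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB, averaged over all pixels and channels."""
    diffs = [(a - b) ** 2
             for ref_row, res_row in zip(reference, restored)
             for p, q in zip(ref_row, res_row)
             for a, b in zip(p, q)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Toy example: a flat color image with a single "salt" pixel.
clean = [[(100, 120, 140)] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = (255, 255, 255)
restored = vmf(noisy)
print(psnr(clean, noisy), psnr(clean, restored))
```

Because the vector median always returns one of the input vectors, the filter never introduces colors that were not already present in the window, which is the usual motivation for applying it to color images.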
