Dataset Information

Predicting classifier performance with limited training data: applications to computer-aided diagnosis in breast and prostate cancer.

ABSTRACT: Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets.
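The abstract describes estimating error rates from small training sets by repeated random sampling, measuring their spread with the interquartile range, and extrapolating to larger datasets. The Python sketch below illustrates that general idea only. It is not the authors' implementation: the synthetic one-dimensional data, the 1-nearest-neighbor classifier, the pure power-law curve a * n^(-alpha), and every function name are assumptions chosen for illustration, and the cross-validation sampling strategy from the paper is not reproduced.

```python
import math
import random
import statistics


def make_data(n, rng):
    """Synthetic two-class 1-D data (assumption): class c is drawn from N(2c, 1)."""
    return [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(n)]


def knn1_error(train, test):
    """Error rate of a 1-nearest-neighbor classifier on (value, label) pairs."""
    wrong = 0
    for x, y in test:
        nearest = min(train, key=lambda p: abs(p[0] - x))
        wrong += nearest[1] != y
    return wrong / len(test)


def sampled_errors(pool, test, n_train, repeats, rng):
    """Error rates from repeated random draws of n_train training samples."""
    return [knn1_error(rng.sample(pool, n_train), test) for _ in range(repeats)]


def iqr(values):
    """Interquartile range (third quartile minus first quartile)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def fit_power_law(sizes, errors):
    """Least-squares fit of log(err) = log(a) - alpha * log(n); returns (a, alpha)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return math.exp(my - slope * mx), -slope


if __name__ == "__main__":
    rng = random.Random(0)
    pool, test = make_data(200, rng), make_data(100, rng)
    sizes = [10, 20, 40, 80]
    mean_errors = []
    for n in sizes:
        errs = sampled_errors(pool, test, n, repeats=20, rng=rng)
        # Guard against log(0) in the fit if every draw happened to be perfect.
        mean_errors.append(max(statistics.fmean(errs), 1e-6))
        print(f"n={n:3d}  mean error={mean_errors[-1]:.3f}  IQR={iqr(errs):.3f}")
    a, alpha = fit_power_law(sizes, mean_errors)
    print(f"extrapolated error at n=1000: {a * 1000 ** -alpha:.3f}")
```

A pure power law decays toward zero error, which is unrealistic when the classes overlap; a curve with an added asymptote would likely fit better but needs a nonlinear fit, so it is left out of this sketch.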

SUBMITTER: Basavanhally A

PROVIDER: S-EPMC4436385 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature


Publications

Predicting classifier performance with limited training data: applications to computer-aided diagnosis in breast and prostate cancer.

Ajay Basavanhally, Satish Viswanath, Anant Madabhushi

PLoS One, 2015-05-18 (issue 5)


PMID: 25993029

Similar Datasets

Project description: BACKGROUND: Diagnoses of skin diseases are frequently delayed in China due to lack of dermatologists. A deep learning-based diagnosis supporting system can facilitate pre-screening patients to prioritize dermatologists' efforts. We aimed to evaluate the classification sensitivity and specificity of deep learning models to classify skin tumors and psoriasis for Chinese population with a modest number of dermoscopic images. METHODS: We developed a convolutional neural network (CNN) based on two datasets from a consecutive series of patients who underwent the dermoscopy in the clinic of the Department of Dermatology, Peking Union Medical College Hospital, between 2016 and 2018, prospectively. In order to evaluate the feasibility of the algorithm, we used two datasets. Dataset I consisted of 7192 dermoscopic images for a multi-class model to differentiate three most common skin tumors and other diseases. Dataset II consisted of 3115 dermoscopic images for a two-class model to classify psoriasis from other inflammatory diseases. We compared the performance of CNN with 164 dermatologists in a reader study with 130 dermoscopic images. The experts' consensus was used as the reference standard except for the cases of basal cell carcinoma (BCC), which were all confirmed by histopathology. RESULTS: The accuracies of multi-class and two-class models were 81.49% ± 0.88% and 77.02% ± 1.81%, respectively. In the reader study, for the multi-class tasks, the diagnosis sensitivity and specificity of 164 dermatologists were 0.770 and 0.962 for BCC, 0.807 and 0.897 for melanocytic nevus, 0.624 and 0.976 for seborrheic keratosis, 0.939 and 0.875 for the "others" group, respectively; the diagnosis sensitivity and specificity of multi-class CNN were 0.800 and 1.000 for BCC, 0.800 and 0.840 for melanocytic nevus, 0.850 and 0.940 for seborrheic keratosis, 0.750 and 0.940 for the "others" group, respectively. For the two-class tasks, the sensitivity and specificity of dermatologists and CNN for classifying psoriasis were 0.872 and 0.838, 1.000 and 0.605, respectively. Both the dermatologists and CNN achieved at least moderate consistency with the reference standard, and there was no significant difference in Kappa coefficients between them (P > 0.05). CONCLUSIONS: The performance of CNN developed with relatively modest number of dermoscopic images of skin tumors and psoriasis for Chinese population is comparable with 164 dermatologists. These two models could be used for screening in patients suspected with skin tumors and psoriasis respectively in primary care hospital.

Project description: Training deep Convolutional Neural Networks (CNNs) presents challenges in terms of memory requirements and computational resources, often resulting in issues such as model overfitting and lack of generalization. These challenges can only be mitigated by using an excessive number of training images. However, medical image datasets commonly suffer from data scarcity due to the complexities involved in their acquisition, preparation, and curation. To address this issue, we propose a compact and hybrid machine learning architecture based on the Morphological and Convolutional Neural Network (MCNN), followed by a Random Forest classifier. Unlike deep CNN architectures, the MCNN was specifically designed to achieve effective performance with medical image datasets limited to a few hundred samples. It incorporates various morphological operations into a single layer and uses independent neural networks to extract information from each signal channel. The final classification is obtained by utilizing a Random Forest classifier on the outputs of the last neural network layer. We compare the classification performance of our proposed method with three popular deep CNN architectures (ResNet-18, ShuffleNet-V2, and MobileNet-V2) using two training approaches: full training and transfer learning. The evaluation was conducted on two distinct medical image datasets: the ISIC dataset for melanoma classification and the ORIGA dataset for glaucoma classification. Results demonstrate that the MCNN method exhibits reliable performance in melanoma classification, achieving an AUC of 0.94 (95% CI: 0.91 to 0.97), outperforming the popular CNN architectures. For the glaucoma dataset, the MCNN achieved an AUC of 0.65 (95% CI: 0.53 to 0.74), which was similar to the performance of the popular CNN architectures. This study contributes to the understanding of mathematical morphology in shallow neural networks for medical image classification and highlights the potential of hybrid architectures in effectively learning from medical image datasets that are limited by a small number of case samples.

