Project description: Cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) plays a pivotal role in preventing autoimmunity and fostering anticancer immunity by interacting with the B7 proteins CD80 and CD86. CTLA-4 was the first immune checkpoint targeted with a monoclonal antibody inhibitor. Checkpoint inhibitors have generated durable responses in many cancer patients, representing a revolutionary milestone in cancer immunotherapy. However, therapeutic efficacy is limited to a small proportion of patients, and immune-related adverse events are noteworthy, especially for monoclonal antibodies directed against CTLA-4. Previously, small molecules were developed to impair the CTLA-4:CD80 interaction; however, they directly targeted CD80 rather than CTLA-4. In this study, we performed artificial intelligence (AI)-powered virtual screening of approximately ten million compounds to target CTLA-4. We validated primary hits with biochemical, biophysical, immunological, and experimental animal assays. We then optimized lead compounds and obtained inhibitors with an inhibitory concentration of 1 micromolar for disrupting the interaction between CTLA-4 and CD80. Unlike ipilimumab, these small molecules did not degrade CTLA-4. Several compounds inhibited tumor development prophylactically and therapeutically in syngeneic and CTLA-4-humanized mice. This project supports an AI-based framework for designing small molecules that target immune checkpoints for cancer therapy.
Project description: Background: Gleason grading of prostate cancer is an important prognostic factor but suffers from poor reproducibility, particularly among non-subspecialist pathologists. Although artificial intelligence (A.I.) tools have demonstrated Gleason grading on par with expert pathologists, it remains an open question whether and to what extent A.I. grading translates to better prognostication. Methods: In this study, we developed a system to predict prostate cancer-specific mortality via A.I.-based Gleason grading and subsequently evaluated its ability to risk-stratify patients on an independent retrospective cohort of 2807 prostatectomy cases from a single European center with 5-25 years of follow-up (median: 13, interquartile range 9-17). Results: Here, we show that the A.I.'s risk scores produced a C-index of 0.84 (95% CI 0.80-0.87) for prostate cancer-specific mortality. Upon discretizing these risk scores into risk groups analogous to pathologist Grade Groups (GG), the A.I. achieved a C-index of 0.82 (95% CI 0.78-0.85). On the subset of cases with a GG provided in the original pathology report (n = 1517), the A.I.'s C-indices are 0.87 and 0.85 for continuous and discrete grading, respectively, compared to 0.79 (95% CI 0.71-0.86) for GG obtained from the reports. These represent improvements of 0.08 (95% CI 0.01-0.15) and 0.07 (95% CI 0.00-0.14), respectively. Conclusions: Our results suggest that A.I.-based Gleason grading can lead to effective risk stratification and warrants further evaluation for improving disease management.
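The C-index reported above (Harrell's concordance index) measures how often, among comparable patient pairs, the higher-risk patient experiences the event earlier. A minimal pure-Python sketch of the metric follows; it is illustrative only and is not the implementation used in the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's event or censoring time. Concordant
    pairs give credit 1, tied risk scores give credit 0.5.
    """
    concordant, tied, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable
```

With a perfectly ordered risk score the index is 1.0; random scores give about 0.5, matching the usual interpretation of the values (0.84, 0.82) reported above.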
Project description: Artificial intelligence (AI) has shown promise for diagnosing prostate cancer in biopsies. However, results have been limited to individual studies, lacking validation in multinational settings. Competitions have been shown to be accelerators for medical imaging innovations, but their impact is hindered by a lack of reproducibility and independent validation. With this in mind, we organized the PANDA challenge, the largest histopathology competition to date with 1,290 participating developers, to catalyze the development of reproducible AI algorithms for Gleason grading using 10,616 digitized prostate biopsies. We validated that a diverse set of submitted algorithms reached pathologist-level performance on independent cross-continental cohorts, fully blinded to the algorithm developers. On United States and European external validation sets, the algorithms achieved agreements of 0.862 (quadratically weighted κ; 95% confidence interval (CI), 0.840-0.884) and 0.868 (95% CI, 0.835-0.900) with expert uropathologists. Successful generalization across different patient populations, laboratories, and reference standards, achieved by a variety of algorithmic approaches, warrants evaluating AI-based Gleason grading in prospective clinical trials.
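The quadratically weighted κ used above penalizes disagreements by the squared distance between assigned grades, so being off by two grade groups costs four times as much as being off by one. A minimal sketch of the metric (illustrative, not the challenge's evaluation code):

```python
def quadratic_weighted_kappa(a, b, n_classes):
    """Cohen's kappa with quadratic weights for two raters' labels
    (integers 0..n_classes-1), e.g. Gleason grade group assignments."""
    # Observed confusion matrix.
    O = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        O[x][y] += 1
    n = len(a)
    # Marginal histograms for the chance-expected matrix.
    hist_a = [sum(O[i]) for i in range(n_classes)]
    hist_b = [sum(O[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num, den = 0.0, 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2  # quadratic weight
            expected = hist_a[i] * hist_b[j] / n
            num += w * O[i][j]
            den += w * expected
    return 1.0 - num / den
```

Perfect agreement yields κ = 1, chance-level agreement yields κ = 0, and systematic disagreement yields negative values, which frames the reported agreements of 0.862 and 0.868 with expert uropathologists.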
Project description: Background: Colposcopy diagnosis and directed biopsy are the key components in cervical cancer screening programs. However, their performance is limited by the requirement for experienced colposcopists. This study aimed to develop and validate a Colposcopic Artificial Intelligence Auxiliary Diagnostic System (CAIADS) for grading colposcopic impressions and guiding biopsies. Methods: Anonymized digital records of 19,435 patients were obtained from six hospitals across China. These records included colposcopic images, clinical information, and pathological results (gold standard). The data were randomly assigned (7:1:2) to a training and a tuning set for developing CAIADS and to a validation set for evaluating performance. Results: The agreement between CAIADS-graded colposcopic impressions and pathology findings was higher than that of colposcopies interpreted by colposcopists (82.2% versus 65.9%, kappa 0.750 versus 0.516, p < 0.001). For detecting pathological high-grade squamous intraepithelial lesion or worse (HSIL+), CAIADS showed higher sensitivity than colposcopies interpreted by colposcopists at either biopsy threshold (low-grade or worse: 90.5%, 95% CI 88.9-91.4% versus 83.5%, 81.5-85.3%; high-grade or worse: 71.9%, 69.5-74.2% versus 60.4%, 57.9-62.9%; all p < 0.001), whereas the specificities were similar (low-grade or worse: 51.8%, 49.8-53.8% versus 52.0%, 50.0-54.1%; high-grade or worse: 93.9%, 92.9-94.9% versus 94.9%, 93.9-95.7%; all p > 0.05). CAIADS also demonstrated superior ability in predicting biopsy sites, with a median mean intersection-over-union (mIoU) of 0.758. Conclusions: CAIADS has potential to assist beginners and to improve the diagnostic quality of colposcopy and biopsy in the detection of cervical precancer and cancer.
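The mIoU reported above for biopsy-site prediction averages the intersection-over-union between predicted and ground-truth regions. For binary masks flattened to 0/1 pixel lists, the per-case IoU can be sketched as follows (illustrative only, not CAIADS code):

```python
def iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks given as
    equal-length sequences of 0/1 pixel values."""
    inter = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    union = sum(1 for x, y in zip(mask_a, mask_b) if x or y)
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0
```

An IoU of 1.0 means the predicted biopsy region exactly matches the annotated one; the study's median mIoU of 0.758 indicates substantial but imperfect overlap.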
Project description: Background: Assessment of spine alignment is crucial in the management of scoliosis, but current auto-analysis of spine alignment suffers from low accuracy. We aimed to develop and validate a hybrid model named SpineHRNet+, which integrates artificial intelligence (AI) and rule-based methods to improve the reliability and interpretability of automated alignment analysis. Methods: From December 2019 to November 2020, 1,542 consecutive patients with scoliosis attending two local scoliosis clinics (The Duchess of Kent Children's Hospital at Sandy Bay in Hong Kong; Queen Mary Hospital in Pok Fu Lam on Hong Kong Island) were recruited. The biplanar radiographs of each patient were collected with the EOS™ imaging system. The collected radiographs were recaptured using smartphones or screenshots, with deidentified images securely stored. Landmarks and alignment parameters manually labelled by a spine surgeon were considered as ground truth (GT). The data were split 8:2 to train and internally test SpineHRNet+, respectively. This was followed by a prospective validation on another 337 patients. Quantitative analyses of landmark predictions were conducted, and the reliability of auto-alignment was assessed using linear regression and Bland-Altman plots. Deformity severity and sagittal abnormality classifications were evaluated with confusion matrices. Findings: SpineHRNet+ achieved accurate landmark detection, with mean Euclidean distance errors of 2.78 and 5.52 pixels on posteroanterior and lateral radiographs, respectively. The mean angle errors between predictions and GT were 3.18° coronally and 6.32° sagittally. All predicted alignments were strongly correlated with GT (p < 0.001, R2 > 0.97), with minimal overall differences visualised via Bland-Altman plots. For curve detection, 95.7% sensitivity and 88.1% specificity were achieved, and for severity classification, 88.6-90.8% sensitivity was obtained.
For sagittal abnormalities, specificity and sensitivity greater than 85.2-88.9% were achieved. Interpretation: The auto-analysis provided by SpineHRNet+ was reliable and continuous, and it might offer the potential to assist clinical work and facilitate large-scale clinical studies. Funding: RGC Research Impact Fund (R5017-18F), Innovation and Technology Fund (ITS/404/18), and the AOSpine East Asia Fund (AOSEA(R)2019-06).
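The Bland-Altman analysis mentioned above summarizes agreement between predicted and manually measured angles via the mean difference (bias) and the 95% limits of agreement (bias ± 1.96 SD of the differences). A minimal pure-Python sketch, not the SpineHRNet+ pipeline:

```python
import math

def bland_altman_limits(pred, truth):
    """Bias and 95% limits of agreement between paired measurements,
    e.g. predicted vs. ground-truth Cobb angles in degrees."""
    diffs = [p - t for p, t in zip(pred, truth)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Sample standard deviation of the differences.
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A bias near zero with narrow limits of agreement corresponds to the "minimal overall differences" described in the findings above.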
Project description: Prostate cancer treatment strategies are guided by risk stratification. This stratification can be difficult in some patients with known comorbidities. New models are needed to guide strategies and to determine which patients are at risk of prostate cancer mortality. This article presents a gradient-boosting model that predicts the risk of prostate cancer mortality within 10 years after a cancer diagnosis and provides an interpretable prediction. This work uses prospective data from the PLCO Cancer Screening Trial, selecting patients who were diagnosed with prostate cancer. During follow-up, 8776 patients were diagnosed with prostate cancer. The dataset was randomly split into a training (n = 7021) and a testing (n = 1755) set. Accuracy was 0.98 (±0.01), and the area under the receiver operating characteristic curve was 0.80 (±0.04). This model can be used to support informed decision-making in prostate cancer treatment, and its interpretability provides users with a novel understanding of the predictions.
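The area under the ROC curve reported above can be computed without plotting via the rank (Mann-Whitney) formulation: the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. A minimal sketch (illustrative only, not the study's gradient-boosting code):

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formulation.

    labels: 0/1 outcomes (e.g. prostate cancer death within 10 years);
    scores: predicted risk scores. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The gap between the high accuracy (0.98) and the more modest AUROC (0.80) reported above is typical of imbalanced outcomes, where most patients do not die of the disease within the horizon, which is why the ranking-based AUROC is the more informative figure here.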
Project description: Aim: In countries where access to mammography equipment and skilled personnel is limited, most breast cancer (BC) cases are detected at locally advanced stages. Infrared breast thermography is recognized as an adjunctive technique for the detection of BC due to advantages such as safety (it neither emits ionizing radiation nor applies any stress to the breast), portability, and low cost. Enhanced by advanced computational analytics, infrared thermography could be a valuable complementary screening technique for detecting BC at early stages. In this work, an infrared-artificial intelligence (AI) software was developed and evaluated to help physicians identify potential BC cases. Methods: Several AI algorithms were developed and evaluated, trained on a proprietary database of 2,700 patients with BC cases confirmed through mammography, ultrasound, and biopsy. Following evaluation of the algorithms, the best one (the infrared-AI software) underwent a clinical validation process in which its ability to detect BC was compared to mammography evaluations in a double-blind test. Results: The infrared-AI software achieved 94.87% sensitivity, 72.26% specificity, 30.08% positive predictive value (PPV), and 99.12% negative predictive value (NPV), whereas the reference mammography evaluation reached 100% sensitivity, 97.10% specificity, 81.25% PPV, and 100% NPV. Conclusions: The infrared-AI software developed here shows high BC sensitivity (94.87%) and high NPV (99.12%). Therefore, it is proposed as a complementary screening tool for BC.
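PPV and NPV depend on disease prevalence as well as on sensitivity and specificity, via Bayes' rule. The sketch below shows the relationship; the prevalence of roughly 11% is a value inferred here from the reported figures, not stated in the description:

```python
def ppv_npv(sens, spec, prevalence):
    """Positive and negative predictive value from sensitivity,
    specificity, and disease prevalence (Bayes' rule)."""
    tp = sens * prevalence              # true positive rate in population
    fp = (1 - spec) * (1 - prevalence)  # false positives
    tn = spec * (1 - prevalence)        # true negatives
    fn = (1 - sens) * prevalence        # false negatives
    return tp / (tp + fp), tn / (tn + fn)

# With the reported sensitivity and specificity, an assumed prevalence
# of about 11.2% reproduces the reported PPV (~30%) and NPV (~99.1%).
ppv, npv = ppv_npv(0.9487, 0.7226, 0.112)
```

This illustrates why a modest PPV can coexist with a very high NPV when the screened population has relatively low disease prevalence, supporting the proposal of the tool for rule-out screening.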
Project description: Purpose: Consistent assessment of bone metastases is crucial for patient management and clinical trials in prostate cancer (PCa). We aimed to develop a fully automated convolutional neural network (CNN)-based model for calculating PET/CT skeletal tumor burden in patients with PCa. Methods: A total of 168 patients from three centers were divided into training, validation, and test groups. Manual annotations of skeletal lesions in [18F]fluoride PET/CT scans were used to train a CNN. The AI model was evaluated in 26 patients and compared to segmentations by physicians and to an SUV threshold of 15. A PET index representing the percentage of skeletal volume taken up by lesions was estimated. Results: There was no lesion whose presence all readers agreed on that the AI model failed to detect. The PET index from the AI model correlated moderately strongly with the physician PET index (mean r = 0.69). The threshold-based PET index correlated fairly well with the physician PET index (mean r = 0.49). The sensitivity for lesion detection was 65-76% for the AI model, 68-91% for physicians, and 44-51% for the threshold, depending on which physician was considered the reference. Conclusion: It was possible to develop an AI-based model for automated assessment of PET/CT skeletal tumor burden. The model's performance was superior to using a threshold, and it provides fully automated calculation of whole-body skeletal tumor burden. It could be further developed to apply to different radiotracers. Objective scan evaluation is a first step toward developing a PET/CT imaging biomarker for PCa skeletal metastases.
Project description: Background: The first sign of metastatic prostate cancer after radical prostatectomy is rising PSA levels in the blood, termed biochemical recurrence. The prediction of recurrence relies mainly on the morphological assessment of prostate cancer using the Gleason grading system. However, in this system, within-grade morphological patterns and subtle histopathological features are currently omitted, leaving a significant amount of prognostic potential unexplored. Methods: To discover additional prognostic information using artificial intelligence, we trained a deep learning system to predict biochemical recurrence directly from tissue in H&E-stained microarray cores. We developed a morphological biomarker using convolutional neural networks, leveraging a nested case-control study of 685 patients, and validated it on an independent cohort of 204 patients. We used concept-based explainability methods to interpret the learned tissue patterns. Results: The biomarker shows a strong correlation with biochemical recurrence in two sets (n = 182 and n = 204) from separate institutions. Concept-based explanations provided tissue patterns interpretable by pathologists. Conclusions: These results show that the model finds predictive power in the tissue beyond morphological ISUP grading. Plain language summary: To determine the prognosis of patients with prostate cancer, several clinical factors are taken into account. One of these is the cancer grade, assigned by a pathologist based on the cancer's appearance under a microscope. The grade ranges from 1 to 5, where 5 is the most aggressive tumour type. This study explored whether deep learning, a technique in which computer software learns patterns from multiple examples, can learn to predict the risk of patients' cancers recurring from microscopic images of the tumours.
We show, on two clinical datasets from different institutions, that such a system can help to better predict prognosis, beyond the information provided by grade alone. In the future, this type of method could help clinicians to predict the prognosis of individual prostate cancer patients. Pinckaers et al. develop a deep learning system to predict biochemical recurrence in prostate cancer patients treated with radical prostatectomy. The authors’ morphological biomarker provides predictive power beyond traditional Gleason grading, based on analysis of two clinical datasets from different institutions.