Test-retest repeatability of a deep learning architecture in detecting and segmenting clinically significant prostate cancer on apparent diffusion coefficient (ADC) maps

Authors: Hiremath Amogh, Shiradkar Rakesh, Merisaari Harri, Prasanna Prateek, Ettala Otto, Taimen Pekka, Aronen Hannu J, Boström Peter J, Jambor Ivan, Madabhushi Anant

Publisher: Springer

Year: 2021

Journal: European Radiology

Volume: 31

Issue: 1

First page: 379

Last page: 391

Number of pages: 13

ISSN: 0938-7994

eISSN: 1432-1084

DOI: https://doi.org/10.1007/s00330-020-07065-4

PMC full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7821380/

Objectives: To evaluate the short-term test-retest repeatability of a deep learning architecture (U-Net) in slice- and lesion-level detection and segmentation of clinically significant prostate cancer (csPCa: Gleason grade group > 1) using apparent diffusion coefficient maps derived from diffusion-weighted imaging fitted with a monoexponential function (ADCm).
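As an illustration of the monoexponential model behind ADCm, the signal is commonly written as S(b) = S0 · exp(−b · ADC). The following is a minimal sketch of a log-linear least-squares fit for a single voxel. The function name `fit_adc_monoexponential`, the b values, and the signal values are hypothetical and are not taken from the study, whose fitting procedure is not described in this abstract.

```python
import numpy as np

def fit_adc_monoexponential(b_values, signals):
    """Fit S(b) = S0 * exp(-b * ADC) by linear regression on log(S).

    Hypothetical helper for illustration; assumes all signals are positive.
    Returns (adc, s0) with adc in the inverse units of b_values.
    """
    b = np.asarray(b_values, dtype=float)
    log_s = np.log(np.asarray(signals, dtype=float))
    # log S = log S0 - b * ADC, so the fitted slope equals -ADC.
    slope, intercept = np.polyfit(b, log_s, 1)
    return -slope, float(np.exp(intercept))

# Synthetic, noise-free example with assumed b values (s/mm^2).
b_demo = np.array([0.0, 500.0, 1000.0, 2000.0])
s_demo = 1000.0 * np.exp(-b_demo * 1.0e-3)
adc_demo, s0_demo = fit_adc_monoexponential(b_demo, s_demo)
```

Repeating such a fit for every voxel yields an ADC map; different choices of b values yield different maps, which is what a "b value setting" refers to here.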

Methods: One hundred twelve patients with prostate cancer (PCa) underwent two prostate MRI examinations on the same day. PCa areas were annotated using whole-mount prostatectomy sections. Two U-Net-based convolutional neural networks (CNNs) were trained on three different ADCm b value settings for (a) slice-level detection, (b) lesion-level detection, and (c) segmentation of csPCa. Short-term test-retest repeatability was estimated using the intra-class correlation coefficient (ICC(3,1)), proportionate agreement, and the Dice similarity coefficient (DSC). Three-fold cross-validation was performed on the training set (N = 78 patients), and performance and repeatability were evaluated on the test set (N = 34 patients).
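The three repeatability measures named above can be sketched as follows. This is a minimal illustration using the standard definitions (ICC(3,1) as the two-way mixed, single-measure, consistency form), not the authors' code; the function names and the toy inputs are hypothetical.

```python
import numpy as np

def icc_3_1(scores):
    """ICC(3,1) for an (n_subjects, k_sessions) array of measurements."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between sessions
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse)

def proportionate_agreement(labels_test, labels_retest):
    """Fraction of cases given the same label in both scans."""
    a = np.asarray(labels_test)
    b = np.asarray(labels_retest)
    return float(np.mean(a == b))

def dice_similarity(mask_a, mask_b):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect overlap
    return 2.0 * np.logical_and(a, b).sum() / denom
```

For example, `icc_3_1([[1, 2], [2, 3], [3, 4]])` is 1 up to rounding, because a constant offset between sessions does not reduce a consistency-type ICC, and `dice_similarity([1, 1, 0, 0], [1, 0, 0, 0])` is 2/3.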

Results: For the three ADCm b value settings, repeatability of the mean ADCm of csPCa lesions was ICC(3,1) = 0.86-0.98. The two CNNs with U-Net-based architecture demonstrated ICC(3,1) in the range of 0.80-0.83, agreement of 66-72%, and DSC of 0.68-0.72 for slice- and lesion-level detection and segmentation of csPCa. Bland-Altman plots suggested no systematic bias between inter-scan ground truth segmentation repeatability and the segmentation repeatability of the networks.
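A Bland-Altman comparison of this kind summarizes paired values by their mean difference (bias) and 95% limits of agreement. The sketch below assumes two paired arrays of per-case repeatability scores, for instance DSC between scans for the ground truth and for a network; the function name and the numbers are hypothetical and are not study data.

```python
import numpy as np

def bland_altman_summary(values_a, values_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * SD of the paired differences.
    """
    a = np.asarray(values_a, dtype=float)
    b = np.asarray(values_b, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up paired scores for illustration only.
bias, lower, upper = bland_altman_summary([0.7, 0.6, 0.8], [0.6, 0.7, 0.8])
```

A bias close to zero, with differences scattered evenly around it, is what "no systematic bias" refers to.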

Conclusions: For the three ADCm b value settings, two CNNs with U-Net-based architecture were repeatable for the detection of csPCa at the slice level. The network repeatability in segmenting csPCa lesions is affected by inter-scan variability and ground truth segmentation repeatability and may thus improve with better inter-scan reproducibility.

Key points:
• For the three ADCm b value settings, two CNNs with U-Net-based architecture were repeatable for the detection of csPCa at the slice level.
• The network repeatability in segmenting csPCa lesions is affected by inter-scan variability and ground truth segmentation repeatability and may thus improve with better inter-scan reproducibility.