A1 Peer-reviewed original article in a scientific journal
Deep learning‐based 3D classification of head and neck cancer PET/MRI: Radiologist comparison and Grad‐CAM interpretability
Authors: Liedes, Joonas; Hirvonen, Jussi; Rainio, Oona; Murtojärvi, Sarita; Malaspina, Simona; Klén, Riku; Kemppainen, Jukka
Publisher: Wiley
Publication year: 2025
Journal: Clinical Physiology and Functional Imaging
Journal name in database: CLINICAL PHYSIOLOGY AND FUNCTIONAL IMAGING
Article number: e70030
Volume: 45
Issue: 5
ISSN: 1475-0961
eISSN: 1475-097X
DOI: https://doi.org/10.1111/cpf.70030
Web address: https://doi.org/10.1111/cpf.70030
Self-archived copy address: https://research.utu.fi/converis/portal/detail/Publication/504538722
Purpose:
To develop and evaluate a three-dimensional convolutional neural network for automated classification of PET/MRI images in head and neck cancer (HNC) patients, assessing its performance against radiologist interpretation and its potential as a diagnostic aid.
Methods:
Data from 202 patients with HNC who underwent 18F-FDG PET/MRI were used to train and validate PET-, MRI-, and PET/MRI-based models. Of these, 101 patients were labelled positive for HNC and 101 negative. An additional test set of 20 patients (10 labelled positive, 10 negative) was also evaluated. Model performance was assessed using sensitivity, specificity, accuracy, and AUC. Grad-CAM was utilised to improve interpretability, and the classification results on the test set were compared with those of a radiologist.
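The abstract does not describe how Grad-CAM was implemented, so the following is only a minimal sketch of the standard Grad-CAM combination step, adapted to 3D feature maps. It assumes the activations of a convolutional layer and the gradients of the class score with respect to them are already available as NumPy arrays; the function name `grad_cam_3d` and the array layout (channels, depth, height, width) are hypothetical and not taken from the article.

```python
import numpy as np

def grad_cam_3d(activations, gradients):
    """Combine 3D feature maps into a Grad-CAM heatmap (illustrative sketch).

    Both inputs are assumed to have shape (channels, depth, height, width):
    `activations` from a convolutional layer, `gradients` being the gradient
    of the class score with respect to those activations.
    """
    # One weight per channel: the gradient averaged over all spatial positions.
    weights = gradients.mean(axis=(1, 2, 3))
    # Weighted sum of the feature maps over the channel axis -> (D, H, W).
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only regions that contribute positively to the class score.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display, guarding against an all-zero map.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

In practice the resulting volume would typically be upsampled to the input resolution and overlaid on the PET or MRI volume; that step is omitted here.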
Results:
The PET-based model achieved an AUC of 0.92 on the test set, with an accuracy of 90%, a sensitivity of 100%, and a specificity of 80%. The PET/MRI- and MRI-based models underperformed relative to the PET-based model. The radiologist achieved perfect classification accuracy. Grad-CAM analysis showed that the model's classifications are based on real areas of interest. It also gave valuable insight into how similar systems could be used to identify false-positive findings.
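As a worked check of how the reported test-set figures relate to one another, the sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts. The counts are not stated in the abstract; they are inferred here from the reported percentages and the 10 positive / 10 negative test set, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # share of positives detected
    specificity = tn / (tn + fp)                 # share of negatives correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # share of all cases classified correctly
    return sensitivity, specificity, accuracy

# Assumed counts, inferred from the reported percentages (not given in the source):
# 10 positives all detected, 8 of 10 negatives correctly classified.
sens, spec, acc = classification_metrics(tp=10, fn=0, tn=8, fp=2)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
# prints: sensitivity=100% specificity=80% accuracy=90%
```

Under these assumed counts, the 80% specificity corresponds to two false positives among the ten negative test cases.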
Conclusion:
The PET-based model demonstrated high sensitivity, indicating its potential as a pre-screening tool for HNC. However, specificity requires improvement to reduce false-positive rates. Enhanced datasets and refinement of model architecture will be crucial before clinical adoption. Grad-CAM provides valuable insights into model decisions, aiding clinical integration.
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
Dr. Rainio received funding from the Sakari Alhopuro Foundation. Dr. Liedes and Dr. Kemppainen received funding from Cancer Foundation Finland. Dr. Hirvonen received funding from the Sigrid Jusélius Foundation. Open access publishing facilitated by Turun yliopisto, as part of the Wiley - FinELib agreement.