Deep learning‐based 3D classification of head and neck cancer PET/MRI: Radiologist comparison and Grad‐CAM interpretability - UTU Research Information System

A1 Peer-reviewed original article in a scientific journal

Deep learning‐based 3D classification of head and neck cancer PET/MRI: Radiologist comparison and Grad‐CAM interpretability

Authors: Liedes, Joonas; Hirvonen, Jussi; Rainio, Oona; Murtojärvi, Sarita; Malaspina, Simona; Klén, Riku; Kemppainen, Jukka

Publication year: 2025

Journal: Clinical Physiology and Functional Imaging

Journal name in the source database: CLINICAL PHYSIOLOGY AND FUNCTIONAL IMAGING

Article number: e70030

Volume: 45

Issue: 5

ISSN: 1475-0961

eISSN: 1475-097X

DOI: https://doi.org/10.1111/cpf.70030

Open access status at the time of recording: Openly available

Open access status of the publication channel: Partially open publication channel

Web address: https://doi.org/10.1111/cpf.70030

Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/504538722

Self-archived copy's license: CC BY

Self-archived copy's version: Publisher's version

Abstract

Purpose:

To develop and evaluate a three-dimensional convolutional neural network for automated classification of PET/MRI images in head and neck cancer (HNC) patients, assessing its performance against radiologist interpretation and its potential as a diagnostic aid.

Methods:

Data from 202 patients with HNC who underwent 18F-FDG PET/MRI were used to train and validate PET-, MRI-, and PET/MRI-based models. Of these, 101 patients were labelled positive for HNC and 101 negative. An additional test set of 20 patients was also evaluated, of whom 10 were labelled positive and 10 negative. Model performance was assessed using sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC). Grad-CAM was used to improve interpretability, and the classification results on the test set were compared with those of a radiologist.
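As a brief illustration of the AUC metric named above, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name `auc_from_scores` and the six scores and labels are hypothetical and chosen only for illustration; they are not data from the study, and the article does not state how its AUC was computed.

```python
def auc_from_scores(scores, labels):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and ground-truth labels (NOT study data).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
example_labels = [1, 1, 0, 1, 0, 0]

example_auc = auc_from_scores(example_scores, example_labels)
print(f"AUC = {example_auc:.3f}")
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9.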

Results:

The PET-based model achieved an AUC of 0.92 on the test set, with an accuracy of 90%, a sensitivity of 100%, and a specificity of 80%. The PET/MRI- and MRI-based models underperformed relative to the PET-based model. The radiologist achieved perfect classification accuracy. Grad-CAM analysis showed that the model's classifications were based on real areas of interest. It also gave valuable insight into how similar systems could be used to identify false-positive findings.
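The reported test-set rates for the PET-based model can be checked with simple arithmetic. With 10 positive and 10 negative test patients, a sensitivity of 100% implies 10 true positives and no false negatives, and a specificity of 80% implies 8 true negatives and 2 false positives. These counts are inferred from the abstract's percentages rather than quoted from the article, and the helper name `classification_metrics` is an illustrative assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Counts inferred from the reported rates and the 10/10 test split
# (an inference from the abstract, not figures quoted from the article).
sens, spec, acc = classification_metrics(tp=10, fn=0, tn=8, fp=2)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}, accuracy={acc:.0%}")
```

The resulting accuracy, (10 + 8) / 20 = 90%, matches the reported value, so the three figures are mutually consistent.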

Conclusion:

The PET-based model demonstrated high sensitivity, indicating its potential as a pre-screening tool for HNC. However, specificity requires improvement to reduce false-positive rates. Enhanced datasets and refinement of model architecture will be crucial before clinical adoption. Grad-CAM provides valuable insights into model decisions, aiding clinical integration.

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.

Liedes_etal_deep_learning-based_2025.pdf

Funding information in the publication:
Dr. Rainio received funding from the Sakari Alhopuro Foundation. Dr. Liedes and Dr. Kemppainen received funding from Cancer Foundation Finland. Dr. Hirvonen received funding from the Sigrid Jusélius Foundation. Open access publishing facilitated by Turun yliopisto, as part of the Wiley - FinELib agreement.