Development and retrospective validation of an artificial intelligence system for diagnostic assessment of prostate biopsies: study protocol

Authors: Mulliqi, Nita; Blilie, Anders; Ji, Xiaoyi; Szolnoky, Kelvin; Olsson, Henrik; Titus, Matteo; Martinez Gonzalez, Geraldine; Boman, Sol Erika; Valkonen, Masi; Gudlaugsson, Einar; Kjosavik, Svein Reidar; Asenjo, Jose; Gambacorta, Marcello; Libretti, Paolo; Braun, Marcin; Kordek, Radzislaw; Lowicki, Roman; Hotakainen, Kristina; Vare, Paivi; Pedersen, Bodil Ginnerup; Sorensen, Karina Dalsgaard; Ulhoi, Benedicte Parm; Rantalainen, Mattias; Ruusuvuori, Pekka; Delahunt, Brett; Samaratunga, Hemamali; Tsuzuki, Toyonori; Janssen, Emilius Adrianus Maria; Egevad, Lars; Kartasalo, Kimmo; Eklund, Martin

Publisher: BMJ

Place of publication: London

Year: 2025

Journal: BMJ Open

Article number: e097591

Volume: 15

Issue: 7

: 19

ISSN: 2044-6055

DOI: https://doi.org/10.1136/bmjopen-2024-097591

: https://research.utu.fi/converis/portal/detail/Publication/499383281

Introduction: Histopathological evaluation of prostate biopsies using the Gleason scoring system is critical for prostate cancer diagnosis and treatment selection. However, grading variability among pathologists can lead to inconsistent assessments, risking inappropriate treatment. Similar challenges complicate the assessment of other prognostic features like cribriform cancer morphology and perineural invasion. Many pathology departments are also facing an increasingly unsustainable workload due to rising prostate cancer incidence and a decreasing pathologist workforce coinciding with increasing requirements for more complex assessments and reporting. Digital pathology and artificial intelligence (AI) algorithms for analysing whole slide images show promise in improving the accuracy and efficiency of histopathological assessments. Studies have demonstrated AI's capability to diagnose and grade prostate cancer comparably to expert pathologists. However, external validations on diverse data sets have been limited and often show reduced performance. Historically, there have been no well-established guidelines for AI study designs and validation methods. Diagnostic assessments of AI systems often lack preregistered protocols and rigorous external cohort sampling, essential for reliable evidence of their safety and accuracy.

Methods and analysis: This study protocol covers the retrospective validation of an AI system for prostate biopsy assessment. The primary objective of the study is to develop a high-performing and robust AI model for diagnosis and Gleason scoring of prostate cancer in core needle biopsies, and to evaluate at scale whether it generalises to fully external data from independent patients, pathology laboratories and digitalisation platforms. The secondary objectives cover AI performance in estimating cancer extent and in detecting cribriform prostate cancer and perineural invasion. This protocol outlines the steps for data collection, predefined partitioning of data cohorts for AI model training and validation, model development and predetermined statistical analyses, ensuring systematic development and comprehensive validation of the system. The protocol adheres to Transparent Reporting of a multivariable prediction model of Individual Prognosis Or Diagnosis+AI (TRIPOD+AI), Protocol Items for External Cohort Evaluation of a Deep Learning System in Cancer Diagnostics (PIECES), Checklist for AI in Medical Imaging (CLAIM) and other relevant best practices.
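
As an illustration only, the idea of predefined partitioning with fully external validation data can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure specified in the protocol: the function name `assign_partition`, the laboratory labels, the hash-based assignment and the `tuning_fraction` parameter are all hypothetical. The sketch holds out whole laboratories as external data and assigns the remaining patients deterministically, so that all slides from one patient always fall in the same partition.

```python
import hashlib


def assign_partition(patient_id, lab, external_labs, tuning_fraction=0.2):
    """Return 'external', 'tuning' or 'training' for a slide's patient.

    Hypothetical rule: every slide from a held-out laboratory is external;
    other patients are assigned by a hash of the patient ID, so the result
    is reproducible and never splits one patient across partitions.
    """
    if lab in external_labs:
        return "external"
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "tuning" if bucket < tuning_fraction else "training"


# Made-up example records: two slides from one patient, one external lab.
slides = [
    {"slide": "s1", "patient": "p1", "lab": "lab_A"},
    {"slide": "s2", "patient": "p1", "lab": "lab_A"},
    {"slide": "s3", "patient": "p2", "lab": "lab_B"},
    {"slide": "s4", "patient": "p3", "lab": "lab_C"},
]
external_labs = {"lab_C"}

partitions = {
    s["slide"]: assign_partition(s["patient"], s["lab"], external_labs)
    for s in slides
}
```

Fixing the assignment rule before any model is trained is what makes the partitioning "predefined": the external cohort cannot be reshuffled after results are seen.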

Ethics and dissemination: Data collection and usage were approved by the respective ethical review boards of each participating clinical laboratory, and centralised anonymised data handling was approved by the Swedish Ethical Review Authority. The study will be conducted in agreement with the Helsinki Declaration. The findings will be disseminated in peer-reviewed publications (open access).

Funding:
AB received a grant from the Health Faculty at the University of Stavanger, Norway. BGP and KDS received funding from Innovation Fund Denmark (Grant no. 8114-00014B) for the Danish branch of the NordCaP project. MR received funding from the Swedish Research Council and the Swedish Cancer Society. PR received funding from the Research Council of Finland (Grant no. 341967) and the Cancer Foundation Finland. ME received funding from the Swedish Research Council, Swedish Cancer Society, Swedish Prostate Cancer Society, Nordic Cancer Union, Karolinska Institutet, and Region Stockholm. KK received funding from the SciLifeLab & Wallenberg Data Driven Life Science Program (KAW 2024.0159), the David and Astrid Hagelen Foundation, Instrumentarium Science Foundation, KAUTE Foundation, Karolinska Institute Research Foundation, Orion Research Foundation and Oskar Huttunen Foundation.