A1 Refereed original research article in a scientific journal
Towards practical federated learning and evaluation for medical prediction models
Authors: Kazlouski, Andrei; Montoya Perez, Ileana; Noor, Faiza; Högerman, Mikael; Ettala, Otto; Pahikkala, Tapio; Airola, Antti
Publisher: Elsevier BV
Publication year: 2025
Journal: International Journal of Medical Informatics
Article number: 106046
Volume: 204
ISSN: 1386-5056
DOI: https://doi.org/10.1016/j.ijmedinf.2025.106046
Web address: https://www.sciencedirect.com/science/article/pii/S1386505625002631?via%3Dihub
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/499495628
Background: Federated learning (FL) is a rapidly advancing technique that enables collaborative model training while preserving data privacy. This approach is particularly relevant in healthcare, where privacy concerns and regulatory restrictions often prevent centralized data sharing. FL has shown promise in tasks such as disease detection, achieving performance levels comparable to centralized systems. However, its practical usability in real-world applications remains underexplored.
Methods: We evaluate the practical effectiveness of FL in predicting whether patients suspected of prostate cancer require invasive biopsy procedures. The study uses 14 publicly available prostate cancer datasets from 10 countries. We propose and benchmark a novel FL evaluation strategy, Leave-Silo-Out (LSO), which quantifies the performance gap between federated training and free-riding (utilizing the federated model without contributing data). Additionally, we investigate whether locally trained models can outperform multi-hospital FL models. The results are assessed with a focus on improving the diagnosis of local patients.
Results: Our findings reveal that the benefits of FL vary with the amount of locally available annotated data. Hospitals with very small datasets see negligible improvements from FL compared to free-riding. Institutions with moderate datasets may achieve some gains through FL training. However, hospitals with extensive datasets often experience little to no advantage from FL and, in some cases, observe reduced performance compared to local training.
Conclusion: Federated learning shows potential in scenarios with limited data availability. However, its practical applicability is highly context-dependent, influenced by factors such as data availability and specific task requirements.
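The Leave-Silo-Out comparison described in the Methods can be illustrated with a minimal sketch. The abstract does not specify the model, the federated algorithm, or the metric, so everything below is an assumption made for illustration: logistic regression as the classifier, federated averaging (FedAvg) as the training scheme, AUC as the score, and synthetic data in place of the prostate cancer datasets. All function and variable names are hypothetical and are not taken from the article.

```python
import numpy as np

def train_logreg(X, y, w=None, epochs=200, lr=0.1):
    """Gradient-descent logistic regression; returns weights with the bias last."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def fedavg(silos, rounds=20, local_epochs=10):
    """Assumed FL scheme: each round, every silo trains locally from the
    global weights and the server averages them weighted by sample count."""
    dim = next(iter(silos.values()))[0].shape[1] + 1
    w = np.zeros(dim)
    for _ in range(rounds):
        updates = [(train_logreg(X, y, w, epochs=local_epochs), len(y))
                   for X, y in silos.values()]
        total = sum(n for _, n in updates)
        w = sum(wi * n for wi, n in updates) / total
    return w

def auc(w, X, y):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    scores = np.hstack([X, np.ones((len(X), 1))]) @ w
    pos, neg = scores[y == 1], scores[y == 0]
    return float((pos[:, None] > neg[None, :]).mean()
                 + 0.5 * (pos[:, None] == neg[None, :]).mean())

def leave_silo_out(train, test):
    """For each silo, score three models on that silo's own test patients:
    local-only, free-riding (FL without the silo), and full FL participation."""
    results = {}
    for name in train:
        others = {k: v for k, v in train.items() if k != name}
        X_te, y_te = test[name]
        results[name] = {
            "local": auc(train_logreg(*train[name]), X_te, y_te),
            "free_riding": auc(fedavg(others), X_te, y_te),
            "federated": auc(fedavg(train), X_te, y_te),
        }
    return results

def make_silo(rng, n, shift):
    """Synthetic stand-in for one hospital's data (not the study's datasets)."""
    X = rng.normal(shift, 1.0, size=(n, 3))
    logits = X @ np.array([1.0, -1.0, 0.5])
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
    return X, y

rng = np.random.default_rng(0)
sizes = {"small": 20, "medium": 200, "large": 2000}   # hypothetical silo sizes
shifts = {"small": 0.0, "medium": 0.2, "large": -0.2}  # mild distribution shift
train = {k: make_silo(rng, n, shifts[k]) for k, n in sizes.items()}
test = {k: make_silo(rng, 200, shifts[k]) for k in sizes}

results = leave_silo_out(train, test)
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this sketch, the difference between the "federated" and "free_riding" scores for a silo plays the role of the performance gap that LSO is described as quantifying, and the "local" score supports the comparison between locally trained and multi-hospital models. The numbers it prints come from synthetic data and say nothing about the study's actual results.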
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
This work has received funding from the European Union’s Horizon Europe research and innovation programme (grant number 101095384) and from the Research Council of Finland (grants 358868, 345804, 345805, 340140, 340182).