A1 Refereed original research article in a scientific journal
The sensitivity of patient-reported outcome measures in surgical and non-surgical care: a systematic review and meta-epidemiological evaluation of randomised controlled trials
Authors: Uimonen, Mikko; Vaajala, Matias; Saarinen, Antti; Liukkonen, Rasmus; Pakarinen, Oskari; Laaksonen, Juho; Ponkilainen, Ville; Kuitunen, Ilari; Panula, Valtteri
Publisher: Elsevier BV
Publication year: 2026
Journal: EClinicalMedicine
Article number: 103776
Volume: 92
eISSN: 2589-5370
DOI: https://doi.org/10.1016/j.eclinm.2026.103776
Open access status at the time of recording: Openly available
Open access status of the publication channel: Fully open publication channel
Web address: https://doi.org/10.1016/j.eclinm.2026.103776
Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/508959943
Self-archived copy's licence: CC BY
Self-archived copy's version: Publisher's version
Background: Accumulation of scores towards the high end of the measurement scale is an important source of bias related to patient-reported outcome measures (PROMs). The aim was to evaluate how PROM score distributions, scale boundaries, and sampling variability influence the likelihood of detecting a minimal clinically important difference (MCID) of 10 points between surgical and non-surgical groups in randomised controlled trials (RCTs) of musculoskeletal disorders.
Methods: We did a systematic review and meta-epidemiological analysis of 129 RCTs, identified from PubMed and Scopus and published until February 26, 2025, that compared surgical and non-surgical interventions in patients with musculoskeletal complaints and used a PROM as an outcome measure (1771 group-level PROM measurements). Simulations assessed each comparison's likelihood of detecting a difference of 10 points or more.
Findings: The mean difference between groups was 4.6 (SD 7.1) points favouring surgery, with surgical arms scoring higher in 72% of comparisons. The mean likelihood of detecting at least a 10-point difference was 19%, meaning fewer than one in five of such comparisons would detect a true difference. Detection likelihood peaked (35%) at a mean score of 70, declining toward scale extremes. Comparisons with significant observed differences (>10 points, p < 0.05) had a 54% likelihood versus 17% in non-significant comparisons, strongly linking detection likelihood to observed differences.
Interpretation: The majority of the PROM-based RCTs were unlikely to detect differences because of ceiling effects, resulting in a constant underestimation of surgical benefit. PROMs with adequate content coverage, better discrimination, and reduced ceiling susceptibility should be selected for clinical practice. Future research should align outcome selection and follow-up timing with expected treatment effects and ensure that measurement properties do not mask meaningful clinical differences.
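As a rough illustration of the kind of simulation mentioned in the Methods, the following Python sketch estimates how often a two-arm comparison would show an observed mean difference of at least 10 points when scores are capped at the bounds of a 0 to 100 scale. It is not the authors' procedure. The normally distributed latent scores, the clipping to the scale bounds, the standard deviation of 15, the arm size of 50, and the function name `detection_likelihood` are all assumptions made for illustration only.

```python
import random


def detection_likelihood(control_mean, true_diff=10.0, sd=15.0, n_per_arm=50,
                         n_sims=2000, threshold=10.0, lo=0.0, hi=100.0, seed=1):
    """Share of simulated trials whose observed mean difference reaches `threshold`.

    Hypothetical helper: latent scores are drawn from normal distributions and
    then clipped to [lo, hi] to mimic floor and ceiling effects of a bounded PROM.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def arm_mean(mu):
        # Draw latent scores and clip each one to the scale bounds.
        scores = [min(hi, max(lo, rng.gauss(mu, sd))) for _ in range(n_per_arm)]
        return sum(scores) / n_per_arm

    hits = 0
    for _ in range(n_sims):
        diff = arm_mean(control_mean + true_diff) - arm_mean(control_mean)
        if diff >= threshold:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Compare a mid-scale control mean with one close to the ceiling.
    for mean in (50.0, 70.0, 90.0):
        print(mean, detection_likelihood(mean))
```

Under these assumptions the latent difference is always 10 points, so any drop in the estimated likelihood as the control mean approaches 100 comes from the clipping alone, which is the mechanism the abstract attributes to ceiling effects.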
Downloadable publication: This is an electronic reprint of the original article.