A1 Peer-reviewed original article in a scientific journal

Efficient Search Algorithms for Identifying Synergistic Associations in High-Dimensional Datasets




Authors: Hourican, Cillian; Li, Jie; Mishra, Pashupati P.; Lehtimäki, Terho; Mishra, Binisha H.; Kähönen, Mika; Raitakari, Olli T.; Laaksonen, Reijo; Keltikangas-Järvinen, Liisa; Juonala, Markus; Quax, Rick

Publisher: MDPI AG

Publication year: 2024

Journal: Entropy

Journal name in source database: Entropy

Journal acronym: Entropy (Basel)

Article number: 968

Volume: 26

Issue: 11

eISSN: 1099-4300

DOI: https://doi.org/10.3390/e26110968

Web address: http://doi.org/10.3390/e26110968

Self-archived copy's web address: https://research.utu.fi/converis/portal/detail/Publication/477015493


Abstract
In recent years, there has been a notably increased interest in the study of multivariate interactions and emergent higher-order dependencies. This is particularly evident in the context of identifying synergistic sets, which are defined as combinations of elements whose joint interactions result in the emergence of information that is not present in any individual subset of those elements. The scalability of frameworks such as partial information decomposition (PID) and those based on multivariate extensions of mutual information, such as O-information, is limited by combinatorial explosion in the number of sets that must be assessed. In order to address these challenges, we propose a novel approach that utilises stochastic search strategies in order to identify synergistic triplets within datasets. Furthermore, the methodology is extensible to larger sets and various synergy measures. By employing stochastic search, our approach circumvents the constraints of exhaustive enumeration, offering a scalable and efficient means to uncover intricate dependencies. The flexibility of our method is illustrated through its application to two epidemiological datasets: the Young Finns Study and the UK Biobank Nuclear Magnetic Resonance (NMR) data. Additionally, we present a heuristic for reducing the number of synergistic sets to analyse in large datasets by excluding sets with overlapping information. We also illustrate the risks of performing a feature selection before assessing synergistic information in the system.
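
To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes discrete variables, uses the standard plug-in estimate of O-information (negative values indicate a synergy-dominated set), and replaces the paper's search strategies, which the abstract does not specify, with plain uniform random sampling of triplets. The function names `entropy`, `o_information`, and `random_triplet_search` are hypothetical.

```python
import random
from collections import Counter
from math import log2


def entropy(observations):
    """Plug-in Shannon entropy in bits of a list of hashable observations."""
    counts = Counter(observations)
    n = len(observations)
    return -sum(c / n * log2(c / n) for c in counts.values())


def o_information(data, cols):
    """O-information of the chosen columns of `data` (a list of rows).

    Omega = (k - 2) * H(X) + sum_j [H(X_j) - H(X without X_j)].
    Negative values indicate that synergy dominates redundancy.
    """
    k = len(cols)
    joint = [tuple(row[c] for c in cols) for row in data]
    total = (k - 2) * entropy(joint)
    for j in range(k):
        single = [row[cols[j]] for row in data]
        rest = [tuple(row[c] for i, c in enumerate(cols) if i != j)
                for row in data]
        total += entropy(single) - entropy(rest)
    return total


def random_triplet_search(data, n_vars, n_samples, seed=0):
    """Score randomly sampled triplets and return the most synergistic one.

    Assumption: uniform random sampling stands in for a stochastic search
    strategy; it avoids enumerating all C(n_vars, 3) triplets.
    """
    rng = random.Random(seed)
    seen = set()
    best = None
    for _ in range(n_samples):
        triplet = tuple(sorted(rng.sample(range(n_vars), 3)))
        if triplet in seen:
            continue  # do not re-score a triplet already evaluated
        seen.add(triplet)
        score = o_information(data, triplet)
        if best is None or score < best[1]:
            best = (triplet, score)
    return best


# Toy data: columns are a, b, a XOR b, and a constant column.
toy = [(a, b, a ^ b, 0) for a in (0, 1) for b in (0, 1)]
best_triplet, best_score = random_triplet_search(toy, n_vars=4, n_samples=200)
```

In this toy example the XOR triplet (columns 0, 1, 2) has an O-information of -1 bit, while every triplet containing the constant column scores 0, so the search should settle on the XOR triplet. On real data the plug-in entropy estimate is biased for small samples, and continuous variables would need a different estimator.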

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Funding information in the publication
This work was supported by the Netherlands Organisation for Health Research and Development (ZonMw), Open Competition Grant 09120012010063 and the To Aition project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 848146. The Young Finns Study has been financially supported by the Academy of Finland: grants 356405, 322098, 286284, 134309 (Eye), 126925, 121584, 124282, 129378 (Salve), 117797 (Gendi), and 141071 (Skidi); the Social Insurance Institution of Finland; Competitive State Research Financing of the Expert Responsibility area of Kuopio, Tampere and Turku University Hospitals (grant X51001); Juho Vainio Foundation; Paavo Nurmi Foundation; Finnish Foundation for Cardiovascular Research; Finnish Cultural Foundation; The Sigrid Juselius Foundation; Tampere Tuberculosis Foundation; Emil Aaltonen Foundation; Yrjö Jahnsson Foundation; Signe and Ane Gyllenberg Foundation; Diabetes Research Foundation of Finnish Diabetes Association; EU Horizon 2020 (grant 755320 for TAXINOMISIS and grant 848146 for To Aition); European Research Council (grant 742927 for MULTIEPIGEN project); Tampere University Hospital Supporting Foundation; Finnish Society of Clinical Chemistry; the Cancer Foundation Finland; pBETTER4U_EU (Preventing obesity through Biologically and bEhaviorally Tailored inTERventions for you; project number: 101080117); and the Jane and Aatos Erkko Foundation. PPM was supported by the Academy of Finland (Grant number: 349708).


Last updated on 2025-27-01 at 19:33