A1 Refereed original research article in a scientific journal

Robust data-driven identification of risk factors and their interactions: A simulation and a study of parental and demographic risk factors for schizophrenia




Authors: Gyllenberg D, McKeague IW, Sourander A, Brown AS

Publisher: WILEY

Publication year: 2020

Journal: International Journal of Methods in Psychiatric Research

Journal name in source: INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH

Journal acronym: INT J METH PSYCH RES

Article number: ARTN e1834

Volume: 29

Issue: 4

First page: 1

Last page: 11

Number of pages: 11

ISSN: 1049-8931

DOI: https://doi.org/10.1002/mpr.1834

Web address: https://doi.org/10.1002/mpr.1834

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/48528693


Abstract
Objectives: Few interactions between risk factors for schizophrenia have been replicated, but fitting all such interactions is difficult due to high-dimensionality. Our aims are to examine significant main and interaction effects for schizophrenia and the performance of our approach using simulated data.

Methods: We apply the machine learning technique elastic net to a high-dimensional logistic regression model to produce a sparse set of predictors, and then assess the significance of odds ratios (OR) with Bonferroni-corrected p-values and confidence intervals (CI). We introduce a simulation model that resembles a Finnish nested case-control study of schizophrenia which uses national registers to identify cases (n = 1,468) and controls (n = 2,975). The predictors include nine sociodemographic factors and all interactions (31 predictors).

Results: In the simulation, interactions with OR = 3 and prevalence = 4% were identified with <5% false positive rate and >= 80% power. None of the studied interactions were significantly associated with schizophrenia, but main effects of parental psychosis (OR = 5.2, CI 2.9-9.7; p < .001), urbanicity (1.3, 1.1-1.7; p = .001), and paternal age >= 35 (1.3, 1.004-1.6; p = .04) were significant.

Conclusions: We have provided an analytic pipeline for data-driven identification of main and interaction effects in case-control data. We identified highly replicated main effects for schizophrenia, but no interactions.
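
The significance step described in the abstract (odds ratios with confidence intervals and Bonferroni-corrected p-values) can be illustrated with a minimal sketch. This is not the authors' code: the function name `wald_summary`, the use of a Wald test with a normal approximation, the unadjusted 95% interval, and the example inputs are all assumptions made here for illustration. It takes a log-odds coefficient and its standard error, as would come from a logistic regression refitted on the predictors retained by the elastic net.

```python
import math


def wald_summary(beta, se, n_tests, z_crit=1.959964):
    """Summarise one logistic regression coefficient (hypothetical helper).

    beta    : estimated log-odds coefficient
    se      : standard error of beta
    n_tests : number of tests used for the Bonferroni correction
    z_crit  : normal quantile for the interval (default gives a 95% CI)
    """
    odds_ratio = math.exp(beta)
    # Wald interval on the log-odds scale, then exponentiated.
    ci_low = math.exp(beta - z_crit * se)
    ci_high = math.exp(beta + z_crit * se)
    # Two-sided p-value under the standard normal approximation.
    z = beta / se
    p_raw = math.erfc(abs(z) / math.sqrt(2.0))
    # Bonferroni: multiply by the number of tests and cap at 1.
    p_adj = min(1.0, p_raw * n_tests)
    return odds_ratio, ci_low, ci_high, p_adj


# Illustrative inputs only (not taken from the study): a coefficient of
# log(3) with standard error 0.3, corrected for 5 retained predictors.
or_, lo, hi, p = wald_summary(math.log(3.0), 0.3, n_tests=5)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}, Bonferroni p = {p:.4f}")
```

Which number of tests to correct for (all candidate predictors or only those retained by the elastic net) is a design choice the abstract does not specify, so `n_tests` is left as a parameter.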

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.





Last updated on 2024-11-26 at 21:36