A1 Refereed original research article in a scientific journal

Permutation-based significance analysis reduces the type 1 error rate in bisulfite sequencing data analysis of human umbilical cord blood samples




Authors: Laajala Essi, Halla-Aho Viivi, Grönroos Toni, Kalim Ubaid Ullah, Vähä-Mäkilä Mari, Nurmio Mirja, Kallionpää Henna, Lietzén Niina, Mykkänen Juha, Rasool Omid, Toppari Jorma, Oresic Matej, Knip Mikael, Lund Riikka, Lahesmaa Riitta, Lähdesmäki Harri

Publisher: TAYLOR & FRANCIS INC

Publication year: 2022

Journal: Epigenetics

Journal name in source: EPIGENETICS

Journal acronym: EPIGENETICS-US

Number of pages: 20

ISSN: 1559-2294

eISSN: 1559-2308

DOI: https://doi.org/10.1080/15592294.2022.2044127

Web address: https://doi.org/10.1080/15592294.2022.2044127

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/175056164


Abstract

DNA methylation patterns are largely established in utero and might mediate the impacts of in utero conditions on later health outcomes. Associations between perinatal DNA methylation marks and pregnancy-related variables, such as maternal age and gestational weight gain, have previously been studied with methylation microarrays, which typically cover less than 2% of human CpG sites. To detect such associations outside these regions, we chose the bisulphite sequencing approach. We collected and curated clinical data on 200 newborn infants, whose umbilical cord blood samples were analysed with the reduced representation bisulphite sequencing (RRBS) method. A generalized linear mixed-effects model was fit for each high-coverage CpG site, followed by spatial and multiple testing adjustment of P values to identify differentially methylated cytosines (DMCs) and regions (DMRs) associated with clinical variables, such as maternal age, mode of delivery, and birth weight. The type 1 error rate was then evaluated with a permutation analysis. We discovered a strong inflation of spatially adjusted P values through the permutation analysis, which we then applied for empirical type 1 error control. The inflation of P values was caused by a common method for spatial adjustment and DMR detection, implemented in the tools comb-p and RADMeth. Based on empirically estimated significance thresholds, very little differential methylation was associated with any of the studied clinical variables, other than sex. With this analysis workflow, the sex-associated differentially methylated regions were highly reproducible across studies, technologies, and statistical models.
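
The permutation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow and does not call comb-p or RADMeth. It assumes that the clinical variable has already been permuted many times, that the full analysis was rerun on each permutation, and that the smallest adjusted P value from each run was recorded. The function names `empirical_threshold` and `empirical_error_rate` are hypothetical.

```python
import math


def empirical_threshold(null_min_pvalues, alpha=0.05):
    """Return a cutoff t such that at most an alpha fraction of the
    permutations produced any P value strictly below t.

    null_min_pvalues: smallest adjusted P value from each permuted run
    (assumed input; how those runs are produced is outside this sketch).
    """
    if not null_min_pvalues:
        raise ValueError("need at least one permutation result")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    ordered = sorted(null_min_pvalues)
    # At most k values lie strictly below ordered[k], and k / n <= alpha.
    k = min(math.floor(alpha * len(ordered)), len(ordered) - 1)
    return ordered[k]


def empirical_error_rate(null_min_pvalues, cutoff):
    """Fraction of permuted runs with at least one P value below cutoff,
    i.e. the estimated chance of any false finding at that cutoff."""
    hits = sum(1 for p in null_min_pvalues if p < cutoff)
    return hits / len(null_min_pvalues)
```

Calling `empirical_error_rate` with the nominal cutoff (for example 0.05) estimates how inflated the adjusted P values are under the null. Calls on the real data would then be made only for P values below the value returned by `empirical_threshold`.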


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.





Last updated on 2024-11-26 at 13:00