A1 Peer-reviewed original article in a scientific journal
KANN: estimation of genetic ancestry profiles by nearest neighbor regression
Authors: Riikonen, Juha; Kerminen, Sini; Havulinna, Aki; Pirinen, Matti
Publisher: Oxford University Press (OUP)
Publication year: 2026
Journal: Nucleic Acids Research
Article number: gkag209
Volume: 54
Issue: 5
ISSN: 0305-1048
eISSN: 1362-4962
DOI: https://doi.org/10.1093/nar/gkag209
Open access status of the publication at the time of recording: Openly available
Open access status of the publication channel: Fully open access publication channel
Web address: https://doi.org/10.1093/nar/gkag209
Self-archived copy's address: https://research.utu.fi/converis/portal/detail/Publication/516225679
Self-archived copy's license: CC BY
Self-archived publication's version: Publisher's version
State-of-the-art methods for inferring individual-level genetic ancestry are based on statistical models for haplotype data. Unfortunately, these methods are computationally demanding, which makes them impractical for biobank-scale analyses. In this paper, we describe KANN, an efficient k-nearest neighbor regression method for individual-level ancestry estimation with respect to predefined source populations using only principal components of genetic structure. In contrast to existing tools that can only use reference samples with a discrete source population assignment, KANN enables the use of reference samples with continuous ancestry profiles across multiple source populations. We observe that KANN’s ancestry estimates agree well with those of the haplotype-based method SOURCEFIND when estimating ancestry profiles across up to 10 Finnish source populations on a dataset of 18 125 Finnish samples from THL Biobank. In the 1000 Genomes Project data, which contain globally diverse genetic backgrounds, KANN produces results highly similar to those of the ADMIXTURE software. Based on our results, KANN is a promising tool for ancestry estimation in large-scale genomic studies.
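The general idea of k-nearest neighbor regression on principal components can be illustrated with a minimal sketch. This is not KANN's implementation: the function name `knn_ancestry`, the unweighted averaging, the Euclidean distance, and the toy data are all assumptions made for illustration, and the abstract does not state how KANN weights neighbors or selects k. The sketch only shows how reference samples with continuous ancestry profiles (rows summing to 1) can be used directly as regression targets.

```python
import numpy as np

def knn_ancestry(ref_pcs, ref_profiles, target_pcs, k=5):
    """Hypothetical helper: estimate each target's ancestry profile as the
    unweighted mean of the profiles of its k nearest reference samples,
    with distances measured in principal component space."""
    ref_pcs = np.asarray(ref_pcs, dtype=float)            # (n_refs, n_pcs)
    ref_profiles = np.asarray(ref_profiles, dtype=float)  # (n_refs, n_sources)
    target_pcs = np.asarray(target_pcs, dtype=float)      # (n_targets, n_pcs)

    # Squared Euclidean distance from every target to every reference sample.
    diffs = target_pcs[:, None, :] - ref_pcs[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)                    # (n_targets, n_refs)

    # Indices of the k closest reference samples for each target.
    nearest = np.argsort(dists, axis=1)[:, :k]            # (n_targets, k)

    # Average the neighbors' profiles; a mean of rows that each sum to 1
    # also sums to 1, so the output is again a valid ancestry profile.
    return ref_profiles[nearest].mean(axis=1)             # (n_targets, n_sources)

# Toy data (made up): four reference samples, two PCs, two source populations.
ref_pcs = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
ref_profiles = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
target_pcs = [[0.05, 0.0], [1.05, 1.0]]

estimates = knn_ancestry(ref_pcs, ref_profiles, target_pcs, k=2)
print(estimates)
```

In this toy example, each target lies between two reference samples, so its estimate is the average of those two profiles. The second reference sample has a mixed profile (0.8, 0.2) rather than a single population label, which is the kind of continuous reference input the abstract describes.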
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
This work was supported by the Sigrid Jusélius Foundation [8047 to M.P.] and the Research Council of Finland [338507, 352795, and 336285 to M.P.]. Funding to pay the Open Access publication charges for this article was provided by the Helsinki University Library.