A1 Refereed original research article in a scientific journal

Coralysis enables sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration




Authors: Sousa, António G. G.; Smolander, Johannes; Junttila, Sini; Elo, Laura L.

Publisher: Oxford University Press (OUP)

Publication year: 2025

Journal: Nucleic Acids Research

Article number: gkaf1128

Volume: 53

Issue: 21

ISSN: 0305-1048

eISSN: 1362-4962

DOI: https://doi.org/10.1093/nar/gkaf1128

Publication's open availability at the time of reporting: Open Access

Publication channel's open availability: Open Access publication channel

Web address: https://doi.org/10.1093/nar/gkaf1128

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/505369787


Abstract

Complex single-cell analyses now routinely integrate multiple datasets, followed by cell-type annotation and differential expression analysis. Current state-of-the-art integration methods often struggle with imbalanced cell types across datasets, particularly when highly similar but distinct cell types are not present in all datasets. Inaccurate integration leads to incorrect annotations, affecting downstream analyses such as differential expression. To streamline single-cell data analysis, we introduce Coralysis, an all-in-one package featuring a sensitive integration algorithm, reference-mapping for accurate automatic annotation, and fine-grained cell-state identification. We demonstrate that Coralysis shows consistently high performance across diverse integration tasks, outperforming state-of-the-art methods, particularly in challenging settings where similar cell types are imbalanced or missing. It accurately predicts cell-type identities across various annotation scenarios. A key strength of Coralysis is its ability to provide cell-specific probability scores, enabling the identification of transient and stable cell states, along with their differential expression patterns. Importantly, Coralysis performs robustly on different types of single-cell data, from transcriptomics to proteomics. Overall, Coralysis includes all the main steps of single-cell data analysis; it preserves subtle biological variation by improving the integration and annotation of imbalanced cell types, and identifies fine-grained cell states, enabling a faithful analysis of the cellular landscape in complex single-cell experiments.






Funding information in the publication
A.G.G.S. and L.L.E. were supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 955321. A.G.G.S. was also supported by the University of Turku, Åbo Akademi University, Turku Graduate School (UTUGS). L.L.E. reports grants from the Research Council of Finland (310561, 329278, 335434, 335611, 341342, and 364700), the Sigrid Juselius Foundation, and Cancer Foundation Finland during the conduct of the study. Our research is also supported by UTUGS, Biocenter Finland, and ELIXIR Finland. Funding to pay the Open Access publication charges for this article was provided by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 955321.


Last updated on 2025-11-26 at 15:21