A1 Refereed original research article in a scientific journal
Coralysis enables sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration
Authors: Sousa, António G. G.; Smolander, Johannes; Junttila, Sini; Elo, Laura L.
Publisher: Oxford University Press (OUP)
Publication year: 2025
Journal: Nucleic Acids Research
Article number: gkaf1128
Volume: 53
Issue: 21
ISSN: 0305-1048
eISSN: 1362-4962
DOI: https://doi.org/10.1093/nar/gkaf1128
Publication's open availability at the time of reporting: Open Access
Publication channel's open availability: Open Access publication channel
Web address: https://doi.org/10.1093/nar/gkaf1128
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/505369787
Complex single-cell analyses now routinely integrate multiple datasets, followed by cell-type annotation and differential expression analysis. Current state-of-the-art integration methods often struggle with imbalanced cell types across datasets, particularly when highly similar but distinct cell types are not present in all datasets. Inaccurate integration leads to incorrect annotations, affecting downstream analyses such as differential expression. To streamline single-cell data analysis, we introduce Coralysis, an all-in-one package featuring a sensitive integration algorithm, reference-mapping for accurate automatic annotation, and fine-grained cell-state identification. We demonstrate that Coralysis shows consistently high performance across diverse integration tasks, outperforming state-of-the-art methods, particularly in challenging settings where similar cell types are imbalanced or missing. It accurately predicts cell-type identities across various annotation scenarios. A key strength of Coralysis is its ability to provide cell-specific probability scores, enabling the identification of transient and stable cell states, along with their differential expression patterns. Importantly, Coralysis performs robustly on different types of single-cell data, from transcriptomics to proteomics. Overall, Coralysis includes all the main steps of single-cell data analysis; it preserves subtle biological variation by improving the integration and annotation of imbalanced cell types, and identifies fine-grained cell states, enabling a faithful analysis of the cellular landscape in complex single-cell experiments.
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
A.G.G.S. and L.L.E. were supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 955321. A.G.G.S. was also supported by the University of Turku, Åbo Akademi University, Turku Graduate School (UTUGS). L.L.E. reports grants from the Research Council of Finland (310561, 329278, 335434, 335611, 341342, and 364700), the Sigrid Juselius Foundation, and Cancer Foundation Finland during the conduct of the study. Our research is also supported by UTUGS, Biocenter Finland, and ELIXIR Finland. Funding to pay the Open Access publication charges for this article was provided by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 955321.