A1 Peer-reviewed original article in a scientific journal

Stochastic limited memory bundle algorithm for clustering in big data




Authors: Karmitsa, Napsu; Eronen, Ville-Pekka; Mäkelä, Marko M.; Pahikkala, Tapio; Airola, Antti

Publisher: Elsevier BV

Place of publication: London

Publication year: 2025

Journal: Pattern Recognition

Journal name in source database: Pattern Recognition

Journal acronym: PATTERN RECOGN

Article number: 111654

Volume: 165

Number of pages: 13

ISSN: 0031-3203

eISSN: 1873-5142

DOI: https://doi.org/10.1016/j.patcog.2025.111654

Web address: https://doi.org/10.1016/j.patcog.2025.111654

Self-archived copy address: https://research.utu.fi/converis/portal/detail/Publication/491806564


Abstract
Clustering is a crucial task in data mining and machine learning. In this paper, we propose an efficient algorithm, BIG-CLuST, for solving minimum sum-of-squares clustering problems in large and big datasets. We first develop a novel stochastic limited memory bundle algorithm (SLMBA) for large-scale nonsmooth finite-sum optimization problems and then formulate the clustering problem accordingly. The BIG-CLuST algorithm - a stochastic adaptation of the incremental clustering methodology - aims to find the global or a high-quality local solution for the clustering problem. It detects good starting points, i.e., initial cluster centers, for the SLMBA, applied as an underlying solver. We evaluate BIG-CLuST on several real-world datasets with numerous data points and features, comparing its performance with other clustering algorithms designed for large and big data. Numerical results demonstrate the efficiency of the proposed algorithm and the high quality of the found solutions on par with the best existing methods.
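The abstract does not spell out the problem formulation, so the following is only a minimal sketch of the standard minimum sum-of-squares clustering objective written as a finite sum: the average, over data points, of the squared distance to the nearest cluster center. The inner minimum over centers is what makes the objective nonsmooth. The sketch also shows a mini-batch estimate of the same objective, which is the kind of quantity a stochastic method could work with. It is not the SLMBA or BIG-CLuST algorithm, and all function names here (`squared_distance`, `mssc_objective`, `minibatch_objective`) are illustrative assumptions rather than names from the article.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mssc_objective(centers, points):
    """Average squared distance from each point to its nearest center.

    The min over centers makes this function nonsmooth in the centers.
    """
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


def minibatch_objective(centers, points, batch_size, rng):
    """Estimate the objective from a random subset of the points.

    Assumption: uniform sampling without replacement; the article may
    sample differently.
    """
    batch = rng.sample(points, batch_size)
    return mssc_objective(centers, batch)


# Toy data: two well-separated groups and one center placed in each group.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
centers = [[0.0, 1.0], [10.0, 1.0]]

rng = random.Random(0)
full_value = mssc_objective(centers, points)
batch_value = minibatch_objective(centers, points, batch_size=2, rng=rng)
print(full_value, batch_value)
```

In this toy case every point lies at squared distance 1 from its nearest center, so the full objective and any mini-batch estimate coincide. On real data the mini-batch value is only a noisy estimate of the full one.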

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Funding information in the publication
The work was financially supported by the Research Council of Finland, Projects No. #345804 and #345805 led by Tapio Pahikkala and Antti Airola.


Last updated on 2025-13-05 at 13:45