A1 Peer-reviewed original article in a scientific journal
Super Level Sets and Exponential Decay: A Synergistic Approach to Stable Neural Network Training
Authors: Chaudary, Jatin; Nidhi, Dipak; Heikkonen, Jukka; Merisaari, Harri; Kanth, Rajiv
Publisher: AI Access Foundation
Publication year: 2025
Journal: Journal of Artificial Intelligence Research
Journal name in database: Journal of Artificial Intelligence Research
Article number: 21
Volume: 83
ISSN: 1076-9757
eISSN: 1943-5037
DOI: https://doi.org/10.1613/jair.1.17272
Web address: https://doi.org/10.1613/jair.1.17272
Self-archived copy address: https://research.utu.fi/converis/portal/detail/Publication/499842072
This paper presents a theoretically grounded optimization framework for neural network training that integrates an Exponentially Decaying Learning Rate with Lyapunov-based stability analysis. We develop a dynamic learning rate algorithm and prove that it induces connected and stable descent paths through the loss landscape by maintaining the connectivity of super-level sets $S_\lambda = \{\theta \in \mathbb{R}^n : \mathcal{L}(\theta) \ge \lambda\}$. Under the condition that the Lyapunov function $V(\theta) = \mathcal{L}(\theta)$ satisfies $\nabla V(\theta) \cdot \nabla \mathcal{L}(\theta) \ge 0$, we establish that these super-level sets are not only connected but also equiconnected across epochs, providing uniform topological stability. We further derive convergence guarantees using a second-order Taylor expansion and demonstrate that our exponentially scheduled learning rate with gradient-based modulation leads to a monotonic decrease in loss. The proposed algorithm incorporates this schedule into a stability-aware update mechanism that adapts step sizes based on both curvature and energy-level geometry. This work formalizes the role of topological structure in convergence dynamics and introduces a provably stable optimization algorithm for high-dimensional, non-convex neural networks.
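As a rough illustration of the scheduling idea only, the sketch below runs plain gradient descent with an exponentially decaying learning rate on a toy quadratic loss and records the loss after every epoch. Everything in it is an assumption made for illustration: the schedule form `eta0 * exp(-decay * epoch)`, the names `exp_decay_lr`, `train`, and `CURVATURES`, and the toy loss itself. The abstract does not give the paper's update rule, and the gradient-based modulation and Lyapunov-based checks it mentions are not reproduced here.

```python
import math

# Hypothetical toy problem: L(theta) = 0.5 * sum(a_i * theta_i^2).
# The curvatures a_i stand in for a network loss; they are not from the paper.
CURVATURES = (1.0, 10.0)


def loss(theta):
    """Toy quadratic loss."""
    return 0.5 * sum(a * t * t for a, t in zip(CURVATURES, theta))


def grad(theta):
    """Gradient of the toy quadratic loss."""
    return [a * t for a, t in zip(CURVATURES, theta)]


def exp_decay_lr(eta0, decay, epoch):
    """Assumed schedule: eta_t = eta0 * exp(-decay * t)."""
    return eta0 * math.exp(-decay * epoch)


def train(theta0, eta0=0.15, decay=0.05, epochs=50):
    """Gradient descent with the assumed exponential schedule.

    Returns the final parameters and the loss recorded before the first
    step and after every epoch.
    """
    theta = list(theta0)
    losses = [loss(theta)]
    for epoch in range(epochs):
        eta = exp_decay_lr(eta0, decay, epoch)
        g = grad(theta)
        theta = [t - eta * gi for t, gi in zip(theta, g)]
        losses.append(loss(theta))
    return theta, losses


if __name__ == "__main__":
    final_theta, history = train((1.0, 1.0))
    print("initial loss:", history[0])
    print("final loss:", history[-1])
```

On this toy problem the initial step size `eta0 = 0.15` is below `2 / max(CURVATURES) = 0.2`, so every step shrinks each coordinate and the recorded loss decreases at every epoch. That threshold is a property of the quadratic example, not a claim about the paper's conditions.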
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
Jatin Chaudhary would like to acknowledge the University of Turku Graduate School’s grant for conducting this work.