A1 Refereed original research article in a scientific journal

Enhancing the Resilience of ROS 2-Based Multi-Robot Systems with Kubernetes: A Case Study on UWB-Based Relative Positioning




Authors: Zhang, Jiaqiang; Yu, Xianjia; Westerlund, Tomi

Publisher: MDPI

Publishing place: BASEL

Publication year: 2025

Journal: Sensors

Journal name in source: SENSORS

Journal acronym: SENSORS-BASEL

Article number: 5067

Volume: 25

Issue: 16

Number of pages: 12

eISSN: 1424-8220

DOI: https://doi.org/10.3390/s25165067

Web address: https://doi.org/10.3390/s25165067

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/499836523


Abstract
ROS (Robot Operating System) has become the de facto standard in robotics research and development, with ROS 2, in particular, offering enhanced support for real-time communication, distributed systems, and scalable multi-robot applications. These capabilities have driven its widespread adoption across academia, industry, and the open-source community. However, deploying ROS 2 applications across heterogeneous hardware platforms remains a complex task, especially in scenarios that require tightly coordinated, multi-agent systems. In such cases, the failure of a single agent can propagate disruptions throughout the system. A representative example is Ultra-wideband (UWB)-based multi-robot relative localization, where inter-robot dependencies are essential for maintaining accurate relative positioning. While Kubernetes offers powerful features for automated deployment and orchestration, its integration with ROS 2 has not yet been thoroughly evaluated within the context of specific robotic applications. This paper addresses this gap by integrating Kubernetes with ROS 2 in a UWB-based multi-robot localization system, using UWB ranging error mitigation as a representative application. An edge cluster comprising five NVIDIA Jetson Nano devices and one laptop is orchestrated using Kubernetes, with a Jetson Nano node mounted on each robot. We deploy Long Short-Term Memory (LSTM)-based error mitigation modules on the edge nodes and systematically induce failures in various combinations of these modules. The system's resilience and robustness are then assessed by analyzing position errors under different failure scenarios.

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Funding information in the publication
This work was supported by the R3Swarms project funded by the Secure Systems Research Center (SSRC), Technology Innovation Institute (TII).


Last updated on 2025-10-09 at 11:14