A1 Refereed original research article in a scientific journal
Enhancing the Resilience of ROS 2-Based Multi-Robot Systems with Kubernetes: A Case Study on UWB-Based Relative Positioning
Authors: Zhang, Jiaqiang; Yu, Xianjia; Westerlund, Tomi
Publisher: MDPI
Publishing place: BASEL
Publication year: 2025
Journal: Sensors
Journal name in source: SENSORS
Journal acronym: SENSORS-BASEL
Article number: 5067
Volume: 25
Issue: 16
Number of pages: 12
eISSN: 1424-8220
DOI: https://doi.org/10.3390/s25165067
Web address: https://doi.org/10.3390/s25165067
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/499836523
ROS (Robot Operating System) has become the de facto standard in robotics research and development, with ROS 2, in particular, offering enhanced support for real-time communication, distributed systems, and scalable multi-robot applications. These capabilities have driven its widespread adoption across academia, industry, and the open-source community. However, deploying ROS 2 applications across heterogeneous hardware platforms remains a complex task, especially in scenarios that require tightly coordinated multi-agent systems. In such cases, the failure of a single agent can propagate disruptions throughout the system. A representative example is ultra-wideband (UWB)-based multi-robot relative localization, where inter-robot dependencies are essential for maintaining accurate relative positioning. While Kubernetes offers powerful features for automated deployment and orchestration, its integration with ROS 2 has not yet been thoroughly evaluated within the context of specific robotic applications. This paper addresses this gap by integrating Kubernetes with ROS 2 in a UWB-based multi-robot localization system, using UWB ranging error mitigation as a representative application. An edge cluster comprising five NVIDIA Jetson Nano devices and one laptop is orchestrated using Kubernetes, with a Jetson Nano node mounted on each robot. We deploy Long Short-Term Memory (LSTM)-based error mitigation modules on the edge nodes and systematically induce failures in various combinations of these modules. The system's resilience and robustness are then assessed by analyzing position errors under different failure scenarios.
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
This work was supported by the R3Swarms project funded by the Secure Systems Research Center (SSRC), Technology Innovation Institute (TII).