ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications - UTU Research Information System

A4 Peer-reviewed article in conference proceedings

ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications

Authors: Fu, Lei; Salimpour, Sahar; Militano, Leonardo; Edelman, Harry; Peña Queralta, Jorge; Toffetti, Giovanni

Editor: N/A

Established conference name: International Conference on Robotic Computing and Communication

Publication year: 2025

Title of proceedings: 2025 International Conference on Robotic Computing and Communication (RoboticCC)

First page: 70

Last page: 77

ISBN: 979-8-3315-4966-4

eISBN: 979-8-3315-4965-7

DOI: https://doi.org/10.1109/RoboticCC68732.2025.00025

Open access status of the publication at the time of recording: Not openly available

Open access status of the publication channel: Not an open access publication channel

Web address: https://ieeexplore.ieee.org/document/11391939

Abstract

Agentic AI systems and Physical or Embodied AI systems are two key research verticals at the forefront of Artificial Intelligence and Robotics, with the Model Context Protocol (MCP) increasingly becoming a key component and enabler of agentic applications. However, the literature at the intersection of these verticals, i.e., Agentic Embodied AI, remains scarce. This paper introduces an MCP server for ROS and ROS2 bags that allows robot data to be analyzed, visualized and processed with natural language through LLMs and VLMs. We describe specific tooling built with robotics domain knowledge; our initial release focuses on mobile robotics and natively supports the analysis of trajectories, laser scan data, transforms, and time series data. In addition, the server provides an interface to standard ROS2 CLI tools (ros2 bag list or ros2 bag info) and can filter bags down to a subset of topics or trim them in time. Coupled with the MCP server, we provide a lightweight UI for benchmarking the tooling with different LLMs, both proprietary (Anthropic, OpenAI) and open-source (through Groq). Our experimental results include an analysis of the tool calling capabilities of eight state-of-the-art LLMs and VLMs, both proprietary and open-source, large and small. Our experiments indicate a large divide in tool calling capabilities, with Kimi K2 and Claude Sonnet 4 demonstrating clearly superior performance. We also conclude that multiple factors affect the success rates, from the tool description schema to the number of arguments and the number of tools available to the models. The code is available under a permissive license at https://github.com/binabik-ai/mcp-rosbags.

Funding information in the publication:
Giovanni Toffetti acknowledges support from the SHIELD project, funded by Hasler Foundation, No. 2024-11-12-205.