A1 Refereed original research article in a scientific journal
Testing human-hand segmentation on in-distribution and out-of-distribution data in human–robot interactions using a deep ensemble model
Authors: Jalayer, Reza; Chen, Yuxin; Jalayer, Masoud; Orsenigo, Carlotta; Tomizuka, Masayoshi
Publisher: Elsevier BV
Publication year: 2025
Journal: Mechatronics
Journal name in source: Mechatronics
Article number: 103365
Volume: 110
ISSN: 0957-4158
DOI: https://doi.org/10.1016/j.mechatronics.2025.103365
Web address: https://www.sciencedirect.com/science/article/pii/S0957415825000741
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/499135968
Reliable detection and segmentation of human hands are critical for enhancing safety and facilitating advanced interactions in human–robot collaboration. Current research predominantly evaluates hand segmentation under in-distribution (ID) data, which reflects the training data of deep learning (DL) models. However, this approach fails to address out-of-distribution (OOD) scenarios that often arise in real-world human–robot interactions. In this work, we make three key contributions. First, we assess the generalization of DL models for hand segmentation under both ID and OOD scenarios, using a newly collected industrial dataset that captures a wide range of real-world conditions, including simple and cluttered backgrounds with industrial tools, varying numbers of hands (0 to 4), gloves, rare gestures, and motion blur. Second, we consider both egocentric and static viewpoints: we evaluate models trained on four datasets, namely EgoHands and Ego2Hands (egocentric, mobile camera) and HADR and HAGS (static, fixed viewpoint), testing them with both egocentric (head-mounted) and static cameras to enable robustness evaluation from multiple points of view. Third, we introduce an uncertainty analysis pipeline based on the predictive entropy of pixels predicted as hands. By applying thresholds established during the validation phase, this pipeline automatically identifies and filters untrustworthy segmentation outputs, significantly improving segmentation reliability in OOD scenarios. For segmentation, we use a deep ensemble model composed of UNet and RefineNet as base learners. Our experiments demonstrate that models trained on industrial datasets (HADR, HAGS) outperform those trained on non-industrial datasets, both in segmentation accuracy and in their ability to flag unreliable outputs via uncertainty estimation.
These findings underscore the necessity of domain-specific training data and show that our uncertainty analysis pipeline can provide a practical safety layer for real-world deployment.
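To make the uncertainty analysis pipeline concrete, the sketch below shows one way to compute pixel-wise predictive entropy from a deep ensemble's probability maps and to flag a segmentation as unreliable when the mean entropy over predicted hand pixels exceeds a validation-set threshold. This is an illustrative reconstruction, not the authors' released code: the function names, the binary (hand vs. background) formulation, and the specific flagging rule (mean entropy over the predicted hand mask) are assumptions for the example.

```python
import numpy as np

def predictive_entropy(member_probs):
    """Pixel-wise predictive entropy of a deep ensemble.

    member_probs: array of shape (M, H, W), where each of the M ensemble
    members (e.g. UNet and RefineNet) gives the probability that a pixel
    belongs to the hand class.
    """
    p = member_probs.mean(axis=0)   # ensemble-mean hand probability
    eps = 1e-12                     # guard against log(0)
    # Binary predictive entropy: -p*log(p) - (1-p)*log(1-p)
    return -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))

def flag_unreliable(member_probs, threshold):
    """Flag an output as untrustworthy when the mean entropy over pixels
    predicted as hand exceeds a threshold chosen on the validation set."""
    h = predictive_entropy(member_probs)
    hand_mask = member_probs.mean(axis=0) > 0.5
    if not hand_mask.any():
        return False  # no hand pixels predicted; nothing to flag
    return float(h[hand_mask].mean()) > threshold
```

An ensemble whose members agree confidently (probabilities near 0 or 1) yields low entropy and passes, while member disagreement pushes the mean probability toward 0.5, where the binary entropy peaks at ln 2, so the output is flagged; this is what allows OOD inputs to be filtered before they reach downstream safety logic.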
This is an electronic reprint of the original article.