General-Purpose Deep Learning Detection and Segmentation Models for Images from a Lidar-Based Camera Sensor - UTU Research Information System

A1 Peer-reviewed original article in a scientific journal

General-Purpose Deep Learning Detection and Segmentation Models for Images from a Lidar-Based Camera Sensor

Authors: Yu Xianjia, Salimpour Sahar, Peña Queralta Jorge, Westerlund Tomi

Publisher: MDPI

Place of publication: Basel

Publication year: 2023

Journal: Sensors

Journal name in database: SENSORS

Journal acronym: SENSORS-BASEL

Article number: 2936

Volume: 23

Issue: 6

Number of pages: 12

DOI: https://doi.org/10.3390/s23062936

Open access status of the publication at the time of recording: Openly available

Open access status of the publication channel: Fully open access publication channel

Web address: https://www.mdpi.com/1424-8220/23/6/2936

Address of the self-archived copy: https://research.utu.fi/converis/portal/detail/Publication/179338592

License of the self-archived copy: CC BY

Version of the self-archived publication: Publisher's version

Abstract

Over the last decade, robotic perception algorithms have significantly benefited from the rapid advances in deep learning (DL). Indeed, a significant amount of the autonomy stack of different commercial and research platforms relies on DL for situational awareness, especially vision sensors. This work explored the potential of general-purpose DL perception algorithms, specifically detection and segmentation neural networks, for processing image-like outputs of advanced lidar sensors. Rather than processing the three-dimensional point cloud data, this is, to the best of our knowledge, the first work to focus on low-resolution images with a 360-degree field of view obtained with lidar sensors by encoding either depth, reflectivity, or near-infrared light in the image pixels. We showed that with adequate preprocessing, general-purpose DL models can process these images, opening the door to their usage in environmental conditions where vision sensors present inherent limitations. We provided both a qualitative and quantitative analysis of the performance of a variety of neural network architectures. We believe that using DL models built for visual cameras offers significant advantages due to their much wider availability and maturity compared to point cloud-based perception.

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.

sensors-23-02936.pdf