A4 Refereed article in a conference publication

A Comparative Study of Deep Learning-based RGB-depth Fusion Methods for Object Detection

Authors: Farahnakian Fahimeh, Heikkonen Jukka

Editors: M. Arif Wani, Feng Luo, Xiaolin (Andy) Li, Dejing Dou, Francesco Bonchi

Conference name: International Conference on Machine Learning and Applications

Publication year: 2021

Book title: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)

First page: 1475

Last page: 1482

ISBN: 978-1-7281-8471-5

eISBN: 978-1-7281-8470-8

DOI: https://doi.org/10.1109/ICMLA51294.2020.00228

Abstract

Object detection, which aims to predict bounding boxes for all objects of interest in an RGB image, is of paramount importance for many real-world applications and has attracted much attention within the computer vision community. However, RGB cameras cannot directly provide depth information, and RGB-based object detectors cannot achieve accurate performance in complex environments. To address this problem, we make two contributions in this paper. Firstly, we thoroughly evaluated the performance of four state-of-the-art unsupervised depth estimation methods in the context of object detection, which can serve as a baseline for other researchers developing more sophisticated methods. Secondly, we investigated whether fusing depth information with RGB can improve the performance of object detection networks. The results obtained on the KITTI dataset show that the RGB-depth fusion approach with MonoDepth as the depth estimation method outperforms the RGB-based and depth-based detectors.