Refereed article in conference proceedings (A4)
A Comparative Study of Deep Learning-based RGB-depth Fusion Methods for Object Detection
List of Authors: Farahnakian Fahimeh, Heikkonen Jukka
Editors: M. Arif Wani, Feng Luo, Xiaolin (Andy) Li, Dejing Dou, Francesco Bonchi
Conference name: International Conference on Machine Learning and Applications
Publication year: 2021
Book title: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)
Start page: 1475
End page: 1482
ISBN: 978-1-7281-8471-5
eISBN: 978-1-7281-8470-8
DOI: http://dx.doi.org/10.1109/ICMLA51294.2020.00228
The object detection task, which attempts to predict bounding boxes for all objects of interest in an RGB image, is of paramount importance for many real-world applications and has attracted much attention within the computer vision community. However, RGB cameras cannot directly provide depth information, and RGB-based object detectors cannot achieve accurate performance in complex environments. To address this problem, we make two contributions in this paper. First, the performance of four state-of-the-art unsupervised depth estimation methods was thoroughly evaluated in the context of object detection, which can serve as a baseline for other researchers developing more sophisticated methods. Second, we investigated whether fusing depth information with RGB can improve the performance of object detection networks. The results obtained on the KITTI dataset show that the RGB-depth fusion approach, with MonoDepth as the depth estimation method, outperforms the RGB-based and depth-based detectors.