A Comparative Study of Deep Learning-based RGB-depth Fusion Methods for Object Detection
Authors: Farahnakian Fahimeh, Heikkonen Jukka
Editors: M. Arif Wani, Feng Luo, Xiaolin (Andy) Li, Dejing Dou, Francesco Bonchi
Conference: International Conference on Machine Learning and Applications
Year: 2021
Published in: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)
Pages: 1475-1482
ISBN: 978-1-7281-8471-5, 978-1-7281-8470-8
DOI: https://doi.org/10.1109/ICMLA51294.2020.00228
The object detection task, which attempts to predict bounding boxes for all objects of interest in an RGB image, is of paramount importance for many real-world applications and has attracted much attention within the computer vision community. However, RGB cameras cannot directly provide depth information, and RGB-based object detectors cannot achieve accurate performance in complex environments. To address this problem, we make two contributions in this paper. Firstly, the performance of four state-of-the-art unsupervised depth estimation methods was thoroughly evaluated in the context of object detection, which can serve as a baseline for other researchers to develop even more sophisticated methods. Secondly, we investigated whether fusing depth information with RGB can improve the performance of object detection networks. The obtained results on the KITTI dataset show that the RGB-depth fusion approach with MonoDepth as the depth estimation method outperforms the RGB-based and depth-based detectors.
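As a concrete illustration of the fusion idea, the sketch below shows one simple way to combine an RGB frame with an estimated depth map by early fusion, i.e. stacking the depth map as a fourth input channel for a detector backbone. This is a minimal sketch under assumptions: the depth map is taken to come from a monocular estimator such as MonoDepth (not reimplemented here), the fuse_rgb_depth helper and the image sizes are illustrative only, and the paper's actual fusion architecture may differ.

# Minimal sketch of early RGB-depth fusion by channel concatenation.
# Assumption: the depth map is produced by a monocular depth estimator
# (e.g. MonoDepth); the fusion scheme shown here is illustrative and is
# not necessarily the architecture used in the paper.
import numpy as np

def fuse_rgb_depth(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Stack an HxWx3 RGB image and an HxW depth map into an HxWx4 array."""
    if rgb.shape[:2] != depth.shape[:2]:
        raise ValueError("RGB and depth must share the same spatial size")
    # Normalize depth to [0, 1] so its scale is comparable to the RGB channels.
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    return np.concatenate([rgb.astype(np.float32) / 255.0, d[..., None]], axis=-1)

if __name__ == "__main__":
    rgb = np.random.randint(0, 256, (375, 1242, 3), dtype=np.uint8)  # KITTI-sized frame
    depth = np.random.rand(375, 1242).astype(np.float32)             # placeholder depth map
    fused = fuse_rgb_depth(rgb, depth)
    print(fused.shape)  # (375, 1242, 4): 4-channel input for a detector backbone

The resulting 4-channel input would then be fed to a detection network whose first convolutional layer accepts four channels instead of three; other fusion strategies (e.g. late fusion of separate RGB and depth streams) are also possible and are the kind of design choice the paper compares.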