A4 Article in conference proceedings

A Comparative Study of Deep Learning-based RGB-depth Fusion Methods for Object Detection

List of Authors: Farahnakian Fahimeh, Heikkonen Jukka

Conference name: International Conference on Machine Learning and Applications

Publication year: 2021

Book title: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)

ISBN: 978-1-7281-8471-5

eISBN: 978-1-7281-8470-8

DOI: http://dx.doi.org/10.1109/ICMLA51294.2020.00228


Object detection, which aims to predict bounding boxes for all objects of interest in an RGB image, is of paramount importance for many real-world applications and has attracted much attention within the computer vision community. However, RGB cameras cannot directly provide depth information, and RGB-based object detectors cannot achieve accurate performance in complex environments. To address this problem, we make two contributions in this paper. First, we thoroughly evaluated the performance of four state-of-the-art unsupervised depth estimation methods in the context of object detection, which can serve as a baseline for other researchers developing more sophisticated methods. Second, we investigated whether fusing depth information with RGB can improve the performance of object detection networks. The results obtained on the KITTI dataset show that the RGB-depth fusion approach with MonoDepth as the depth estimation method outperforms the RGB-based and depth-based detectors.
