Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms
Authors: Taufique, Zain; Vyas, Aman; Miele, Antonio; Liljeberg, Pasi; Kanduri, Anil
Editors: N/A
Conference: IEEE International Conference on Computer Design
Year: 2024
Journal: Proceedings : IEEE International Conference on Computer Design
Book title: 2024 IEEE 42nd International Conference on Computer Design (ICCD)
Volume: 42
First page: 300
Last page: 307
ISBN: 979-8-3503-8041-5
eISBN: 979-8-3503-8040-8
ISSN: 1063-6404
eISSN: 2576-6996
DOI: https://doi.org/10.1109/ICCD63220.2024.00053
URL: https://ieeexplore.ieee.org/document/10817997
Self-archived copy: https://research.utu.fi/converis/portal/detail/Publication/477606436
There is an increasing demand to run DNN applications on edge platforms for low-latency inference. Executing multi-DNN workloads with diverse compute and latency requirements on resource-constrained heterogeneous edge platforms poses a significant scheduling challenge. In this work, we present the Tango framework for orchestrating multi-DNN inference on heterogeneous edge platforms. Our approach uses a Proximal Policy-based Reinforcement Learning agent to jointly optimize cluster selection, accuracy configuration, and frequency scaling, minimizing inference latency within a tolerable accuracy loss. We implemented Tango as a portable middleware and deployed it on real Jetson TX edge hardware. Our evaluation against relevant multi-DNN scheduling strategies demonstrates 61 % lower latency and 48.4 % lower energy consumption at a maximum accuracy loss of 1.59 %.
This work is funded by the European Union’s Horizon 2020 Research and Innovation Program (APROPOS) under the Marie Curie grant No. 956090.