Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms




Taufique, Zain; Vyas, Aman; Miele, Antonio; Liljeberg, Pasi; Kanduri, Anil


IEEE International Conference on Computer Design

2024

Proceedings : IEEE International Conference on Computer Design

2024 IEEE 42nd International Conference on Computer Design (ICCD)

42

300

307

979-8-3503-8041-5

979-8-3503-8040-8

1063-6404

2576-6996

DOI: https://doi.org/10.1109/ICCD63220.2024.00053

https://ieeexplore.ieee.org/document/10817997

https://research.utu.fi/converis/portal/detail/Publication/477606436



There is an increasing demand to run DNN applications on edge platforms for low-latency inference. Executing multi-DNN workloads with diverse compute and latency requirements on resource-constrained heterogeneous edge platforms poses a significant scheduling challenge. In this work, we present Tango, a framework for orchestrating multi-DNN inference on heterogeneous edge platforms. Our approach uses a Proximal Policy-based Reinforcement Learning agent to jointly optimize cluster selection, accuracy configuration, and frequency scaling, minimizing inference latency within a tolerable accuracy loss. We implemented Tango as a portable middleware and deployed it on a real Jetson TX edge platform. Our evaluation against relevant multi-DNN scheduling strategies demonstrates 61% lower latency and 48.4% lower energy consumption at a maximum accuracy loss of 1.59%.



This work is funded by the European Union’s Horizon 2020 Research and Innovation Program (APROPOS) under the Marie Curie grant No. 956090.


Last updated on 2025-24-02 at 10:17