A4 Refereed article in a conference publication

Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms




Authors: Taufique, Zain; Vyas, Aman; Miele, Antonio; Liljeberg, Pasi; Kanduri, Anil

Editors: N/A

Conference name: IEEE International Conference on Computer Design

Publication year: 2024

Journal: Proceedings : IEEE International Conference on Computer Design

Book title: 2024 IEEE 42nd International Conference on Computer Design (ICCD)

Volume: 42

First page: 300

Last page: 307

ISBN: 979-8-3503-8041-5

eISBN: 979-8-3503-8040-8

ISSN: 1063-6404

eISSN: 2576-6996

DOI: https://doi.org/10.1109/ICCD63220.2024.00053

Web address: https://ieeexplore.ieee.org/document/10817997

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/477606436


Abstract

There is an increasing demand to run DNN applications on edge platforms for low-latency inference. Executing multi-DNN workloads with diverse compute and latency requirements on resource-constrained heterogeneous edge platforms poses a significant scheduling challenge. In this work, we present Tango, a framework for orchestrating multi-DNN inference on heterogeneous edge platforms. Our approach uses a Proximal Policy-based Reinforcement Learning agent to jointly optimize cluster selection, accuracy configuration, and frequency scaling, minimizing inference latency within a tolerable accuracy loss. We implemented the proposed Tango framework as a portable middleware and deployed it on real hardware, the Jetson TX edge platform. Our evaluation against relevant multi-DNN scheduling strategies demonstrates 61 % lower latency and 48.4 % lower energy consumption at a maximum accuracy loss of 1.59 %.
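As a rough illustration of what jointly optimizing cluster selection, accuracy configuration, and frequency scaling could look like, the sketch below builds one discrete action per combination of the three knobs and scores an outcome by latency with a penalty for excess accuracy loss. This is not the paper's implementation: the cluster names, accuracy configurations, frequency levels, tolerance, and penalty weight are all hypothetical placeholders chosen for the example.

```python
from itertools import product
from typing import NamedTuple

# All names and values below are hypothetical placeholders, not taken from the paper.
CLUSTERS = ("cpu_a", "cpu_b", "gpu")
ACCURACY_CONFIGS = ("full", "reduced")
FREQ_LEVELS = (0, 1, 2)  # indices into an assumed platform frequency table


class Action(NamedTuple):
    cluster: str
    accuracy_config: str
    freq_level: int


# One discrete action per (cluster, accuracy config, frequency level) triple.
ACTIONS = [Action(*t) for t in product(CLUSTERS, ACCURACY_CONFIGS, FREQ_LEVELS)]


def reward(latency_ms: float, accuracy_loss_pct: float,
           max_loss_pct: float = 2.0, penalty: float = 100.0) -> float:
    """Negative latency, with an extra penalty when accuracy loss exceeds the tolerance."""
    excess = max(0.0, accuracy_loss_pct - max_loss_pct)
    return -latency_ms - penalty * excess
```

Under these assumptions, a policy-gradient agent would output an index into `ACTIONS` for each inference request and be trained to maximize `reward`, so that lower latency is favored only while accuracy loss stays within the assumed tolerance.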


Funding information in the publication
This work is funded by the European Union’s Horizon 2020 Research and Innovation Program (APROPOS) under the Marie Curie grant No. 956090.


Last updated on 2025-02-24 at 10:17