A4 Peer-reviewed article in conference proceedings
TNEST: Training Sparse Neural Network for FPGA Based Edge Application
Authors: Das, Rammi; Karn, Rupesh Raj; Heikkonen, Jukka; Kanth, Rajeev
Editors: Daimi, Kevin; Al Sadoon, Abeer
Established conference name: International Conference on Advances in Computing Research
Publisher: Springer Science and Business Media Deutschland GmbH
Place of publication: Cham
Year of publication: 2024
Title of the collection: Proceedings of the Second International Conference on Advances in Computing Research (ACR’24)
Series name: Lecture Notes in Networks and Systems
Volume: 956
Start page: 15
End page: 28
ISBN: 978-3-031-56949-4
eISBN: 978-3-031-56950-0
ISSN: 2367-3370
eISSN: 2367-3389
DOI: https://doi.org/10.1007/978-3-031-56950-0_2
URL: https://doi.org/10.1007/978-3-031-56950-0_2
Machine learning (ML) inference hardware has advanced to ultra-low-power edge devices that accelerate inference workloads. The FPGA (Field Programmable Gate Array) is a popular platform for such systems. Although an FPGA is constrained in power budget, memory, compute resources, and chip area, it offers several key advantages, including bandwidth savings, speed, real-time inference, and offline operation. Neural networks are widely used in edge systems owing to their prominence in AI. The need for complex neural networks in edge applications has been recognized only recently, so the research community has yet to settle on a standard model for them. A sparse neural architecture reduces the number of active neurons and connections, making the whole system more computationally efficient and less memory-intensive. These benefits are increasingly evident in edge-based IoT applications. In this work, we customize neural network training algorithms to fit edge systems precisely. Rather than the traditional top-down approach, in which models are fully trained and then pruned to fit within edge-device resources, we adopt a generative approach in which models start with the smallest number of parameters and further components are added as needed to improve inference accuracy. For the same precision, our generative model shows significant savings in FPGA resource consumption compared with the top-down approach.
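The bottom-up ("generative") training idea described in the abstract can be sketched as a capacity-growing loop: start from a minimal network and add components only while doing so still improves inference accuracy. The sketch below is illustrative only; `train_and_eval` is a hypothetical stand-in for the authors' training-and-validation step, not their actual implementation.

```python
def train_and_eval(num_hidden):
    """Hypothetical stand-in: returns validation accuracy for a model
    with `num_hidden` hidden units (illustrative diminishing returns)."""
    return min(0.95, 0.60 + 0.05 * num_hidden)

def grow_until_converged(start=1, step=1, min_gain=0.01, max_hidden=64):
    """Generative approach: add capacity while accuracy keeps improving.

    Contrast with top-down pruning, which starts from a large trained
    model and removes parameters to fit edge-device resources.
    """
    hidden = start
    acc = train_and_eval(hidden)
    while hidden + step <= max_hidden:
        new_acc = train_and_eval(hidden + step)
        if new_acc - acc < min_gain:  # extra capacity no longer pays off
            break
        hidden, acc = hidden + step, new_acc
    return hidden, acc

size, acc = grow_until_converged()
print(size, round(acc, 2))  # smallest model that reached the accuracy plateau
```

Because growth stops as soon as the accuracy gain falls below `min_gain`, the resulting model never carries the excess parameters that a prune-after-training pipeline would have to remove, which is where the FPGA resource savings come from.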