Refereed article in conference proceedings (A4)
High-bandwidth on-chip communication architecture for general purpose computing
List of Authors: Forsell M, Leppänen V
Editors: Callaos Nagib, Lesso William, Palesi Maurizio
Publication year: 2005
Book title: Proceedings of 9th World Multi-Conference on Systemics, Cybernetics and Informatics, Volume IV
Journal name in source: WMSCI 2005: 9th World Multi-Conference on Systemics, Cybernetics and Informatics, Vol 4
Start page: 1
End page: 6
Number of pages: 6
ISBN: 980-6560-56-6
Abstract
Current on-chip communication architectures fail to support fine-grained general-purpose computing due to a lack of bandwidth, scalability, and efficient synchronization schemes. In this paper we attack these problems by describing a double acyclic sparse mesh communication architecture featuring constant-degree switches, fixed-length intercommunication wiring, a chip-wide synchronization wave scheme, and a linear bandwidth scaling mechanism. The architecture is part of our previously outlined ECLIPSE computing architecture. It is compared against the traditional mesh approach using simulations and VHDL modeling and is shown to provide much higher bandwidth in the random communication needed to realize a general-purpose high-performance computing engine, while the silicon area overhead of the communication architecture remains at an acceptable level. The performance of the architecture is verified with real parallel benchmarks.