Cost of Bandwidth-Optimized Sparse Mesh Layouts - UTU Research Information System

A4 Peer-reviewed article in conference proceedings

Cost of Bandwidth-Optimized Sparse Mesh Layouts

Authors: Forsell Martti, Leppänen Ville, Penttonen Martti

Editor: Victor Malyshkin

Established conference name: International Conference on Parallel Computing Technologies

Publication year: 2015

Book title: Parallel Computing Technologies - 13th International Conference, PaCT 2015, Petrozavodsk, Russia, August 31 - September 4, 2015, Proceedings

Series name: Lecture Notes in Computer Science

Volume: 9251

First page: 375

Last page: 389

Number of pages: 15

ISBN: 978-3-319-21908-0

ISSN: 0302-9743

DOI: https://doi.org/10.1007/978-3-319-21909-7_37

Abstract

The requirements of interconnection networks for shared memory chip multiprocessors (CMP) differ from those of traditional application-specific networks on chip (NOC). This is because modern CMP cores tend to inject memory references into the network frequently (up to once per clock cycle), and the latency of references should be as low as possible. The throughput computing paradigm is a mechanism for trading the low-latency requirement for high throughput in CMPs by overlapping memory references from processors with the help of multithreading. To meet the bandwidth requirements of throughput computing CMPs, we have studied using d-dimensional sparse meshes and tori. Unfortunately, it has turned out that either there is too much bandwidth, leading to high silicon area and energy consumption, or the links get longer, decreasing the clock rate. In this paper we study the cost of bandwidth-optimized 2-dimensional meshes and tori for CMPs using the throughput computing paradigm. We present the layout, determine the link length and node degree, and compare them to those of d-dimensional meshes and tori. For area and power efficiency considerations, we also give estimates of silicon area and power consumption.