A1 Peer-reviewed original article in a scientific journal
REPLICA MBTAC: multithreaded dual-mode processor
Authors: Martti Forsell, Jussi Roivainen, Ville Leppänen
Publisher: Springer
Publication year: 2018
Journal: Journal of Supercomputing
Volume: 74
Issue: 5
First page: 1911
Last page: 1933
Number of pages: 23
ISSN: 0920-8542
eISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-017-2199-z
URL: https://doi.org/10.1007/s11227-017-2199-z
The prevailing trend in the design of chip multiprocessors (CMPs) has been
to replicate single-core processors. As a result, CMPs typically define an
asynchronous computational model, require heavily locality-aware memory
allocation, and incur high overheads in intercommunication. These
properties make parallel programming very challenging and error-prone.
We introduce our new dual-mode MultiBunched/Threaded
Architecture with Chaining (MBTAC) processor core, the main building
block of the REPLICA CMP. It provides a modern, sophisticated way of
writing general-purpose parallel programs, backed by native execution
capabilities that realize key concepts. These include support for
cost-efficient machine-instruction-level synchronization and a uniform
shared global memory, which enable easy-to-program memory allocation of
data structures and data movement. MBTAC uses a low-overhead
thread-context switching solution; it has a parallel-computing-savvy
functional unit organization to exploit inter-thread instruction-level
parallelism and highly efficient multioperations. To evaluate our
proposal, we implemented three MBTAC constellations
featuring up to 2048 parallel threads on an FPGA and compared them
with DLX and Intel’s Core i7 processors. The results point toward high
performance in communication-intensive problems, simplified parallel
programmability, and a regular, implementation-friendly structure.
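
The abstract mentions multioperations without defining them. As a rough illustration only, the sketch below assumes the meaning common in the PRAM literature: many threads target the same shared memory word in one step, and the machine combines their contributions. The function names `multi_add` and `multiprefix_add` are hypothetical and are not taken from the article, and the sequential loop only models the observable result, not how the MBTAC hardware computes it.

```python
def multi_add(initial, contributions):
    """Model a multi-add step: every thread adds its value to one shared word.

    Returns the contents of the word after all contributions are combined.
    """
    return initial + sum(contributions)


def multiprefix_add(initial, contributions):
    """Model a multiprefix-add step on one shared word (assumed semantics).

    contributions[i] is the value added by thread i. Each thread receives the
    word's contents as if all lower-numbered threads had already contributed.
    Returns (values received per thread, final contents of the word).
    """
    received = []
    running = initial
    for value in contributions:  # thread order is taken to be index order
        received.append(running)
        running += value
    return received, running


if __name__ == "__main__":
    # Four threads each add 1 to a shared counter that starts at 0.
    received, word = multiprefix_add(0, [1, 1, 1, 1])
    print(received, word)  # [0, 1, 2, 3] 4
```

Under these assumed semantics, each thread that adds 1 to a shared counter receives a distinct index in a single step, which is the kind of communication-intensive pattern the abstract says the architecture handles well.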