A4 Refereed article in a conference publication

Prototyping the MBTAC processor for the REPLICA CMP




Authors: Forsell Martti, Roivainen Jussi, Leppänen Ville

Editors: Manish Parashar

Conference name: IEEE International Parallel and Distributed Processing Symposium Workshops

Publication year: 2014

Book title: Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International

First page: 709

Last page: 716

Number of pages: 8

ISBN: 978-1-4799-4117-9

eISBN: 978-1-4799-4116-2

DOI: https://doi.org/10.1109/IPDPSW.2014.82


Abstract

Current chip multiprocessors (CMPs) have mostly been designed by replicating sequential/single-core processors and providing some support for operating them with a shared memory. As a result, they define an asynchronous computational model of threads, often require maximizing the locality of memory references to get decent performance, and feature high intercommunication overheads that make parallel programming tedious for general-purpose functionalities. Most of these problems can be eliminated by designing the processor architecture for scalable general-purpose computing from the very beginning, as is done in processors for configurable emulated shared memory (CESM) CMPs. They provide support for machine instruction-level synchronization, make use of multithreading to support latency-insensitive computation, and promote the concept of uniform synchronous shared memory for easy variable allocation and convenient data exchange. In our earlier work we proposed the first CESM architecture, TOTAL ECLIPSE, composed of early MBTAC processors that make use of very low-overhead multithreading, a parallel-computing-savvy functional unit organization, support for fast synchronization between instructions and threads, and highly efficient multioperations. Unfortunately, certain key parts of these processors turned out to be hard to implement, and overall they lacked support for ordered multiprefix operations and full configurability of the CESM scheme. In this paper we introduce a new, fully configurable version of the MBTAC processor for our new REPLICA CESM architecture, together with its first FPGA implementations. To evaluate it, we execute short test programs on it and compare it preliminarily against Intel Core i7 and DLX processors. Our FPGA design flow and testing approach are also described.
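For readers unfamiliar with the term, the following is a minimal sketch of what an ordered multiprefix operation computes, written as a sequential reference model. It is not taken from the paper and does not describe the MBTAC hardware or its instruction set. The function name `ordered_multiprefix_add` and its parameters are hypothetical, and the sketch assumes the common definition in which several threads add a value to the same shared location in one step, each thread receives the sum of the contributions of the threads ordered before it, and the location ends up holding the total.

```python
from typing import List, Tuple


def ordered_multiprefix_add(initial: int, contributions: List[int]) -> Tuple[List[int], int]:
    """Sequential reference model of an ordered multiprefix-add (illustrative only).

    contributions[i] is the value that thread i adds to one shared location.
    Returns the per-thread results and the final value of the location.
    """
    results = []
    running = initial
    # Iterating in thread-index order is what makes the operation "ordered".
    for value in contributions:
        results.append(running)  # thread i sees initial + sum of contributions[0..i-1]
        running += value
    return results, running


prefixes, total = ordered_multiprefix_add(0, [3, 1, 4, 1])
print(prefixes, total)  # [0, 3, 4, 8] 9
```

In this model the per-thread results depend on thread order, which is the property an unordered multioperation does not guarantee; a hardware implementation would aim to produce the same values without serializing the threads.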




Last updated on 2024-11-26 at 15:08