A4 Refereed article in a conference publication
Design of Fault-Tolerant and Reliable Networks-on-Chip
Authors: Wang JS, Ebrahimi M, Huang L, Jantsch A, Li GJ
Editors: Aida Todri-Sanial, Patrick Girard, Giorgio Di Natale
Conference name: IEEE Computer Society Annual Symposium on VLSI
Publication year: 2015
Book title: 2015 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI
Journal name in source: 2015 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI
Journal acronym: IEEE COMPUT SOC
Series title: IEEE Computer Society Annual Symposium on VLSI
First page: 545
Last page: 550
Number of pages: 6
ISBN: 978-1-4799-8718-4
ISSN: 2159-3469
DOI: https://doi.org/10.1109/ISVLSI.2015.33
Abstract
Networks-on-Chip (NoCs) are at the core of high-performance multi-processor systems-on-chip. As the number of cores and sub-systems on chip grows, the size and complexity of NoCs increase as well. Due to process variation, aging effects, and soft errors in current and expected future process generations, the probability of failure in NoCs rises and has to be fought at all levels: circuit, architecture, and communication protocols. This paper discusses appropriate fault models for NoCs and their effects on the architecture and network levels. A method to design fault-tolerant NoCs, comprising techniques at the link level, the routing level, and the end-to-end level of the communication, is presented. In addition, the proposed method offers an isolation technique in which the computing cores are decoupled from faults in the network. This technique avoids, or at least attenuates, the severe impact of faults on network performance and functionality. These point techniques are combined to design fault-tolerant and reliable NoCs.