A1 Refereed original research article in a scientific journal
Tolerating transient illegal turn faults in NoCs
Authors: Huang LT, Zhang XF, Ebrahimi M, Li GJ
Publisher: ELSEVIER SCIENCE BV
Publication year: 2016
Journal: Microprocessors and Microsystems
Journal name in source: MICROPROCESSORS AND MICROSYSTEMS
Journal acronym: MICROPROCESS MICROSY
Volume: 43
Issue: SI
First page: 104
Last page: 115
Number of pages: 12
ISSN: 0141-9331
eISSN: 1872-9436
DOI: https://doi.org/10.1016/j.micpro.2016.01.016
Abstract
Network-on-Chip (NoC) is becoming a competitive solution for connecting hundreds of processing elements in modern computing platforms. As feature sizes shrink, circuits become more likely to suffer from faults that lead to degraded performance and erroneous behaviour. Compared with permanent faults, transient faults occur even more frequently and with serious effect, while remaining hidden within complex on-chip behaviours. One serious consequence of transient faults is that, once the control logic of an on-chip router is damaged, packets may take illegal turns, which can lead to deadlock and eventually crash the entire system. To avoid this situation, we propose a comprehensive scheme called ODT, comprising an improved router architecture, an illegal-turn-resilient routing algorithm, online fault-detection units and a fault classification method. With ODT, more turns are supported at the routing level and deadlock situations can be significantly reduced. Experimental results indicate an increase of up to 22% in the number of surviving packets in the network when 4% of the routing computation units have failed. The extra area overhead and power consumption of the ODT method are around 9.22% and 9.63%, respectively. © 2016 Elsevier B.V. All rights reserved.