A 1 GHz Hardware Loop-Accelerator With Razor-Based Dynamic Adaptation for Energy-Efficient Operation

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, 2014-08, Vol. 61 (8), p. 2290-2298
Main Authors: Das, Shidhartha, Dasika, Ganesh S., Shivashankar, Karthik, Bull, David
Format: Article
Language: English
Description
Summary: Dynamic adaptation using Razor-based detection and correction of timing errors has demonstrated substantial improvements in performance and energy-efficiency in microprocessors. In this work, we apply Razor to hardware accelerators that find increasing application in System-on-Chip designs with high-performance requirements that must be delivered under stringent power budgets. We describe the implementation and silicon measurement results from a Razor-based hardware loop-accelerator (RZLA), implementing the Sobel edge-detection algorithm. Unlike in microprocessors, the RZLA pipeline is datapath-dominated with statically-scheduled control that has queue-based storage structures which are simply extended to support check-pointing and recovery. We exploit these characteristics typical of DSP and image-processing accelerators to implement Razor recovery in a manner that is amenable to RTL validation and verification. We show a low-overhead pulsed-latch based Razor Flip-flop (RFF) architecture that adds only a single extra transistor on the clock to minimize clock power overhead. The RFF is deployed in conjunction with a level-sensitive latch-insertion based algorithm to address the minimum-delay constraint present in all Razor systems. This algorithm enables the use of 50% of the clock period for timing speculation, leading to robust error detection and correction across a wide dynamic voltage- and frequency-scaling range. Fabricated in 65 nm CMOS, the RZLA reclaims voltage margins to demonstrate 34% energy-efficiency improvements on a per-device basis and 33% overall, for the entire batch of devices at 1 GHz operation.
ISSN: 1549-8328, 1558-0806
DOI: 10.1109/TCSI.2014.2333332