
CATS: Combined Activation and Temporal Suppression for Efficient Network Inference


Bibliographic Details
Main Authors: Zhu, Zeqi; Pourtaherian, Arash; Waeijen, Luc; Akkaya, Ibrahim Batuhan; Bondarev, Egor; Moreira, Orlando
Format: Conference Proceeding
Language: English
Description
Summary: Brain-inspired event-driven processors execute deep neural networks (DNNs) in a sparsity-aware manner, leading to superior performance compared to conventional platforms. In the pursuit of higher event sparsity, prior studies suppress non-zero events either by eliminating intra-frame activations (spatially) or by leveraging the redundancy in the inter-frame differences of a video (temporally). However, we have empirically observed that simultaneously enhancing activation and temporal sparsity can lead to a synergistic suppression outcome. To this end, we propose CATS (Combined Activation and Temporal Suppression), an end-to-end event suppression training approach for efficient network inference. It uses a gradient-based method to search for the optimal temporal threshold of each layer while penalizing the presence of events in both the spatial and temporal domains. Our experimental results show that CATS achieves 2× to 6× higher event suppression than the inherent ReLU suppression across a wide range of vision applications, consistently outperforming state-of-the-art (SOTA) methods by a significant margin at all accuracy levels. Furthermore, a case study on the commercial event-driven processor GrAI-VIP shows that the induced event sparsity in SSD on the EgoHands dataset translates efficiently into improvements of 2.5× in FPS, 2.1× in latency, and 3.8× in energy consumption, while maintaining model accuracy.
ISSN: 2642-9381
DOI: 10.1109/WACV57701.2024.00798
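
Illustration: The summary contrasts activation (spatial) suppression with temporal suppression of inter-frame differences. The sketch below is not the authors' implementation; it only counts events for a toy sequence of pre-activation frames under one plausible reading of those two ideas. The function names are hypothetical, and the temporal threshold is a fixed number chosen by hand, whereas CATS is described as learning per-layer thresholds with a gradient-based search.

```python
import numpy as np


def relu(x):
    """Standard ReLU: negative pre-activations become zero (no event)."""
    return np.maximum(x, 0.0)


def count_activation_events(frames):
    """Events if every non-zero post-ReLU activation is sent for every frame."""
    return int(np.count_nonzero(relu(frames)))


def count_temporal_events(frames, threshold):
    """Events if a neuron fires only when its post-ReLU activation differs
    from the value it last transmitted by more than `threshold`.

    Assumption: the reference state starts at zero and is updated only for
    neurons that fire, so small changes accumulate instead of being lost.
    """
    acts = relu(frames)
    reference = np.zeros_like(acts[0])  # last transmitted value per neuron
    events = 0
    for act in acts:
        fire = np.abs(act - reference) > threshold
        events += int(np.count_nonzero(fire))
        reference = np.where(fire, act, reference)
    return events


# Toy input: 3 frames, 3 neurons (pre-activation values).
frames = np.array([
    [0.0, 1.0, -2.0],
    [0.05, 1.02, -1.0],
    [0.5, 1.0, 3.0],
])
print(count_activation_events(frames))     # ReLU suppression only
print(count_temporal_events(frames, 0.1))  # ReLU plus temporal threshold
```

In this toy case the small frame-to-frame changes in the second frame fall below the threshold and produce no events, which is the kind of redundancy the temporal approach is meant to exploit.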