ANSA: Adaptive Near-Sensor Architecture for Dynamic DNN Processing in Compact Form Factors

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, 2023-03, Vol. 70 (3), p. 1256-1269
Main Authors: Pinkham, Reid; Erhardt, Jack; De Salvo, Barbara; Berkovich, Andrew; Zhang, Zhengya
Format: Article
Language: English
Description
Summary: Advanced edge sensing/computing devices, such as AR/VR devices, have a uniquely challenging adaptive baseline workload and camera sensor structure. These devices must process images in real time from multiple sensors, placing a large burden on a typical centralized mobile SoC processor. Augmenting the sensors with a package-integrated near-sensor processor can improve the device's processing performance as well as reduce energy consumption. This near-sensor processor must adapt to the dynamic workloads, fit within a limited silicon footprint and energy envelope, and satisfy the real-time requirement. In this work, we present ANSA, a near-sensor processor architecture supporting flexible processing schemes and dataflows to maintain high efficiency for dynamic CNN workloads. ANSA is scalable to sub-mm² sizes to match the footprint of advanced image sensors. ANSA supports module-level power gating to adapt the compute capacity to dynamic workloads. Finally, ANSA leverages recent advancements in high-density non-volatile memory and 3D packaging to support weight storage within the area constraints of an image sensor. Overall, ANSA achieves inference energy consumption up to 30× lower than a standard SIMD baseline. Additionally, our design's scalability allows it to achieve up to 2.76× lower average inference energy at 4.5× lower silicon area compared to competing edge accelerator designs.
ISSN: 1549-8328, 1558-0806
DOI: 10.1109/TCSI.2022.3228725