ANSA: Adaptive Near-Sensor Architecture for Dynamic DNN Processing in Compact Form Factors
Published in: | IEEE Transactions on Circuits and Systems I: Regular Papers, 2023-03, Vol.70 (3), p.1256-1269 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Summary: | Advanced edge sensing/computing devices, such as AR/VR devices, have a uniquely challenging adaptive baseline workload and camera sensor structure. These devices must process images in real-time from multiple sensors, placing a large burden on a typical centralized mobile SoC processor. Augmenting the sensors with a package-integrated near-sensor processor can improve the device's processing performance as well as reduce energy consumption. This near-sensor processor must adapt to the dynamic workloads, fit within a limited silicon footprint and energy envelope, and satisfy the real-time requirement. In this work, we present ANSA, a near-sensor processor architecture supporting flexible processing schemes and dataflows to maintain high efficiency for dynamic CNN workloads. ANSA is scalable to sub-mm² sizes to match the footprint of advanced image sensors. ANSA supports module-level power gating to adapt the compute capacity to dynamic workloads. Finally, ANSA leverages recent advancements in high-density non-volatile memory and 3D packaging to support weight storage within the area constraints of an image sensor. Overall, ANSA achieves inference energy consumption up to 30× lower than a standard SIMD baseline. Additionally, our design's scalability allows it to achieve up to 2.76× lower average inference energy at 4.5× lower silicon area compared to competing edge accelerator designs. |
---|---|
ISSN: | 1549-8328, 1558-0806 |
DOI: | 10.1109/TCSI.2022.3228725 |