Loading…

Intermittent-Aware Neural Architecture Search

The increasing paradigm shift towards intermittent computing has made it possible to intermittently execute deep neural network (DNN) inference on edge devices powered by ambient energy. Recently, neural architecture search (NAS) techniques have achieved great success in automatically finding DNNs w...

Full description

Saved in:
Bibliographic Details
Published in:ACM transactions on embedded computing systems 2021-10, Vol.20 (5s), p.1-27, Article 64
Main Authors: Mendis, Hashan Roshantha, Kang, Chih-Kai, Hsiu, Pi-cheng
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The increasing paradigm shift towards intermittent computing has made it possible to intermittently execute deep neural network (DNN) inference on edge devices powered by ambient energy. Recently, neural architecture search (NAS) techniques have achieved great success in automatically finding DNNs with high accuracy and low inference latency on the deployed hardware. We make a key observation, where NAS attempts to improve inference latency by primarily maximizing data reuse, but the derived solutions when deployed on intermittently-powered systems may be inefficient, such that the inference may not satisfy an end-to-end latency requirement and, more seriously, they may be unsafe given an insufficient energy budget. This work proposes iNAS, which introduces intermittent execution behavior into NAS to find accurate network architectures with corresponding execution designs, which can safely and efficiently execute under intermittent power. An intermittent-aware execution design explorer is presented, which finds the right balance between data reuse and the costs related to intermittent inference, and incorporates a preservation design search space into NAS, while ensuring the power-cycle energy budget is not exceeded. To assess an intermittent execution design, an intermittent-aware abstract performance model is presented, which formulates the key costs related to progress preservation and recovery during intermittent inference. We implement iNAS on top of an existing NAS framework and evaluate their respective solutions found for various datasets, energy budgets and latency requirements, on a Texas Instruments device. Compared to those NAS solutions that can safely complete the inference, the iNAS solutions reduce the intermittent inference latency by 60% on average while achieving comparable accuracy, with an average 7% increase in search overhead.
ISSN:1539-9087
1558-3465
DOI:10.1145/3476995