
14.1 A 510nW 0.41V Low-Memory Low-Computation Keyword-Spotting Chip Using Serial FFT-Based MFCC and Binarized Depthwise Separable Convolutional Neural Network in 28nm CMOS

Bibliographic Details
Main Authors: Shan, Weiwei, Yang, Minhao, Xu, Jiaming, Lu, Yicheng, Zhang, Shuai, Wang, Tao, Yang, Jun, Shi, Longxing, Seok, Mingoo
Format: Conference Proceeding
Language: English
Description
Summary: Ultra-low power is a strong requirement for always-on speech interfaces in wearable and mobile devices, such as Voice Activity Detection (VAD) and Keyword Spotting (KWS) [1]-[5]. A KWS system is used to detect specific wake-up words by speakers and has to be always on. Previous ASICs for KWS lack energy-efficient implementations having power < 5µW. For example, deep neural network (DNN)-based KWS [1] has a large on-chip weight memory of 270KB and consumes 288µW. A binarized convolutional neural network (CNN) used 52KB of SRAM, 141µW wakeup power at 2.5MHz, 0.57V [2]. An LSTM-based SoC used 105KB of SRAM and reduced power to 16.11µW for KWS with 90.8% accuracy on the Google Speech Command Dataset (GSCD) [3]. Laika reduced power to 5µW [4], not including the Mel Frequency Cepstrum Coefficient (MFCC) circuit. High compute and memory requirements have prevented always-on KWS chips from operating in the sub-µW range.
ISSN:2376-8606
DOI:10.1109/ISSCC19947.2020.9063000