Zero time waste in pre-trained early exit neural networks

Bibliographic Details
Published in:	Neural networks 2023-11, Vol.168, p.580-601
Main Authors:	Wójcik, Bartosz; Przewięźlikowski, Marcin; Szatkowski, Filip; Wołczyk, Maciej; Bałazy, Klaudia; Krzepkowski, Bartłomiej; Podolak, Igor; Tabor, Jacek; Śmieja, Marek; Trzciński, Tomasz
Format:	Article
Language:	English
Subjects:	Conditional computation; Deep learning; Dynamic neural networks; Early-exiting networks; Zero waste models

Description
Summary:	The problem of reducing processing time of large deep learning models is a fundamental challenge in many real-world applications. Early exit methods strive towards this goal by attaching additional Internal Classifiers (ICs) to intermediate layers of a neural network. ICs can quickly return predictions for easy examples and, as a result, reduce the average inference time of the whole model. However, if a particular IC does not decide to return an answer early, its predictions are discarded, and its computations are effectively wasted. To solve this issue, we introduce Zero Time Waste (ZTW), a novel approach in which each IC reuses predictions returned by its predecessors by (1) adding direct connections between ICs and (2) combining previous outputs in an ensemble-like manner. We conduct extensive experiments across multiple modes, datasets, and architectures to demonstrate that ZTW achieves a significantly better accuracy vs. inference time trade-off than other early exit methods. On the ImageNet dataset, it obtains superior results over the best baseline method in 11 out of 16 cases, reaching up to 5 percentage points of improvement on low computational budgets.
ISSN:	0893-6080, 1879-2782
DOI:	10.1016/j.neunet.2023.10.003
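
The two mechanisms named in the summary, (1) direct connections between ICs and (2) an ensemble-like combination of previous outputs, can be illustrated with a small sketch. This is not the authors' implementation. The function names, the confidence threshold, the unweighted geometric mean used as the combination rule, and the toy heads are all assumptions made for illustration, since the summary does not specify these details.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def geometric_ensemble(prob_list):
    """Combine class distributions with an unweighted geometric mean.

    Assumption: the summary only says outputs are combined "in an
    ensemble-like manner"; this particular rule is illustrative.
    """
    n = len(prob_list)
    num_classes = len(prob_list[0])
    log_mean = [
        sum(math.log(p[c] + 1e-12) for p in prob_list) / n
        for c in range(num_classes)
    ]
    return softmax(log_mean)


def early_exit_predict(ic_heads, features, threshold=0.9):
    """Run ICs in order and stop once the combined confidence is high enough.

    ic_heads: callables head(feature, prev_probs) -> logits (hypothetical API).
    features: one feature vector per IC. They are precomputed here for
              brevity; a real network would compute them layer by layer.
    Returns (predicted class, index of the IC at which inference stopped).
    """
    if not ic_heads:
        raise ValueError("at least one internal classifier is required")
    history = []  # outputs of every IC evaluated so far
    prev = None   # output of the immediately preceding IC
    for k, (head, feat) in enumerate(zip(ic_heads, features)):
        probs = softmax(head(feat, prev))       # (1) reuse predecessor output
        history.append(probs)
        combined = geometric_ensemble(history)  # (2) combine all outputs so far
        if max(combined) >= threshold:
            break
        prev = probs
    return combined.index(max(combined)), k


def make_toy_head(scale):
    """Hypothetical IC: scaled features plus the log of the previous output."""
    def head(feat, prev):
        logits = [scale * f for f in feat]
        if prev is not None:
            logits = [z + math.log(p + 1e-12) for z, p in zip(logits, prev)]
        return logits
    return head


heads = [make_toy_head(s) for s in (0.5, 1.0, 2.0)]
feats = [[1.0, 0.0, 0.0]] * 3
label, exit_index = early_exit_predict(heads, feats, threshold=0.6)
```

In this sketch, an IC that does not trigger an exit still contributes: its output feeds the next head and stays in the ensemble, so the computation is not discarded.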