SplitEE: Early Exit in Deep Neural Networks with Split Computing

Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. However, deploying full-fledged DNNs in resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. To overcome the issue, various approaches are considered, like of...

Bibliographic Details
Published in:	arXiv.org 2023-09
Main Authors:	Bajpai, Divya J; Trivedi, Vivek K; Yadav, Sohan L; Hanawal, Manjesh K
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Artificial neural networks; Computation; Inference; Neural networks; Source code

container_title	arXiv.org
creator	Bajpai, Divya J; Trivedi, Vivek K; Yadav, Sohan L; Hanawal, Manjesh K
description	Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. However, deploying full-fledged DNNs on resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. To overcome the issue, various approaches are considered, such as offloading part of the computation to the cloud for final inference (split computing) or performing the inference at an intermediary layer without passing through all layers (early exits). In this work, we propose combining both approaches by using early exits in split computing. In our approach, we decide up to what depth of the DNN to perform computation on the device (splitting layer) and whether a sample can exit from this layer or needs to be offloaded. The decisions are based on a weighted combination of accuracy, computational, and communication costs. We develop an algorithm named SplitEE to learn an optimal policy. Since pre-trained DNNs are often deployed in new domains where the ground truths may be unavailable and samples arrive in a streaming fashion, SplitEE works in an online and unsupervised setup. We extensively perform experiments on five different datasets. SplitEE achieves a significant cost reduction ($>50\%$) with a slight drop in accuracy ($<2\%$) as compared to the case when all samples are inferred at the final layer. The anonymized source code is available at \url{https://anonymous.4open.science/r/SplitEE_M-B989/README.md}.
format	article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2866249644</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2866249644</sourcerecordid><originalsourceid>FETCH-proquest_journals_28662496443</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRwCC7IySxxdbVScE0syqlUcK3ILFHIzFNwSU0tUPBLLS1KzAFSJeX5RdnFCuWZJRkKYA0Kzvm5BaUlmXnpPAysaYk5xam8UJqbQdnNNcTZQ7egKL-wNLW4JD4rv7QoDygVb2RhZmZkYmlmYmJMnCoAvAo4SQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2866249644</pqid></control><display><type>article</type><title>SplitEE: Early Exit in Deep Neural Networks with Split Computing</title><source>Publicly Available Content Database</source><creator>Bajpai, Divya J ; Trivedi, Vivek K ; Yadav, Sohan L ; Hanawal, Manjesh K</creator><creatorcontrib>Bajpai, Divya J ; Trivedi, Vivek K ; Yadav, Sohan L ; Hanawal, Manjesh K</creatorcontrib><description>Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. However, deploying full-fledged DNNs in resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. To overcome the issue, various approaches are considered, like offloading part of the computation to the cloud for final inference (split computing) or performing the inference at an intermediary layer without passing through all layers (early exits). In this work, we propose combining both approaches by using early exits in split computing. In our approach, we decide up to what depth of DNNs computation to perform on the device (splitting layer) and whether a sample can exit from this layer or need to be offloaded. The decisions are based on a weighted combination of accuracy, computational, and communication costs. We develop an algorithm named SplitEE to learn an optimal policy. 
Since pre-trained DNNs are often deployed in new domains where the ground truths may be unavailable and samples arrive in a streaming fashion, SplitEE works in an online and unsupervised setup. We extensively perform experiments on five different datasets. SplitEE achieves a significant cost reduction ($>50\%$) with a slight drop in accuracy ($<2\%$) as compared to the case when all samples are inferred at the final layer. The anonymized source code is available at \url{https://anonymous.4open.science/r/SplitEE_M-B989/README.md}.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Accuracy ; Algorithms ; Artificial neural networks ; Computation ; Inference ; Neural networks ; Source code</subject><ispartof>arXiv.org, 2023-09</ispartof><rights>2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2866249644?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>778,782,25736,36995,44573</link.rule.ids></links><search><creatorcontrib>Bajpai, Divya J</creatorcontrib><creatorcontrib>Trivedi, Vivek K</creatorcontrib><creatorcontrib>Yadav, Sohan L</creatorcontrib><creatorcontrib>Hanawal, Manjesh K</creatorcontrib><title>SplitEE: Early Exit in Deep Neural Networks with Split Computing</title><title>arXiv.org</title><description>Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. 
However, deploying full-fledged DNNs in resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. To overcome the issue, various approaches are considered, like offloading part of the computation to the cloud for final inference (split computing) or performing the inference at an intermediary layer without passing through all layers (early exits). In this work, we propose combining both approaches by using early exits in split computing. In our approach, we decide up to what depth of DNNs computation to perform on the device (splitting layer) and whether a sample can exit from this layer or need to be offloaded. The decisions are based on a weighted combination of accuracy, computational, and communication costs. We develop an algorithm named SplitEE to learn an optimal policy. Since pre-trained DNNs are often deployed in new domains where the ground truths may be unavailable and samples arrive in a streaming fashion, SplitEE works in an online and unsupervised setup. We extensively perform experiments on five different datasets. SplitEE achieves a significant cost reduction ($>50\%$) with a slight drop in accuracy ($<2\%$) as compared to the case when all samples are inferred at the final layer. 
The anonymized source code is available at \url{https://anonymous.4open.science/r/SplitEE_M-B989/README.md}.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Computation</subject><subject>Inference</subject><subject>Neural networks</subject><subject>Source code</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mRwCC7IySxxdbVScE0syqlUcK3ILFHIzFNwSU0tUPBLLS1KzAFSJeX5RdnFCuWZJRkKYA0Kzvm5BaUlmXnpPAysaYk5xam8UJqbQdnNNcTZQ7egKL-wNLW4JD4rv7QoDygVb2RhZmZkYmlmYmJMnCoAvAo4SQ</recordid><startdate>20230917</startdate><enddate>20230917</enddate><creator>Bajpai, Divya J</creator><creator>Trivedi, Vivek K</creator><creator>Yadav, Sohan L</creator><creator>Hanawal, Manjesh K</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope></search><sort><creationdate>20230917</creationdate><title>SplitEE: Early Exit in Deep Neural Networks with Split Computing</title><author>Bajpai, Divya J ; Trivedi, Vivek K ; Yadav, Sohan L ; Hanawal, Manjesh K</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_28662496443</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Computation</topic><topic>Inference</topic><topic>Neural networks</topic><topic>Source 
code</topic><toplevel>online_resources</toplevel><creatorcontrib>Bajpai, Divya J</creatorcontrib><creatorcontrib>Trivedi, Vivek K</creatorcontrib><creatorcontrib>Yadav, Sohan L</creatorcontrib><creatorcontrib>Hanawal, Manjesh K</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bajpai, Divya J</au><au>Trivedi, Vivek K</au><au>Yadav, Sohan L</au><au>Hanawal, Manjesh K</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>SplitEE: Early Exit in Deep Neural Networks with Split Computing</atitle><jtitle>arXiv.org</jtitle><date>2023-09-17</date><risdate>2023</risdate><eissn>2331-8422</eissn><abstract>Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks. However, deploying full-fledged DNNs in resource-constrained devices (edge, mobile, IoT) is difficult due to their large size. 
To overcome the issue, various approaches are considered, like offloading part of the computation to the cloud for final inference (split computing) or performing the inference at an intermediary layer without passing through all layers (early exits). In this work, we propose combining both approaches by using early exits in split computing. In our approach, we decide up to what depth of DNNs computation to perform on the device (splitting layer) and whether a sample can exit from this layer or need to be offloaded. The decisions are based on a weighted combination of accuracy, computational, and communication costs. We develop an algorithm named SplitEE to learn an optimal policy. Since pre-trained DNNs are often deployed in new domains where the ground truths may be unavailable and samples arrive in a streaming fashion, SplitEE works in an online and unsupervised setup. We extensively perform experiments on five different datasets. SplitEE achieves a significant cost reduction ($>50\%$) with a slight drop in accuracy ($<2\%$) as compared to the case when all samples are inferred at the final layer. The anonymized source code is available at \url{https://anonymous.4open.science/r/SplitEE_M-B989/README.md}.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2023-09
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2866249644
source	Publicly Available Content Database
subjects	Accuracy; Algorithms; Artificial neural networks; Computation; Inference; Neural networks; Source code
title	SplitEE: Early Exit in Deep Neural Networks with Split Computing
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T22%3A05%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=SplitEE:%20Early%20Exit%20in%20Deep%20Neural%20Networks%20with%20Split%20Computing&rft.jtitle=arXiv.org&rft.au=Bajpai,%20Divya%20J&rft.date=2023-09-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2866249644%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_28662496443%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2866249644&rft_id=info:pmid/&rfr_iscdi=true