Multihardware Adaptive Latency Prediction for Neural Architecture Search
Published in: | IEEE Internet of Things Journal, 2025-02, Vol. 12 (3), p. 3385-3398 |
---|---|
Main Authors: | Lin, Chengmin; Yang, Pengfei; Wang, Quan; Guo, Yitong; Wang, Zhenyi |
Format: | Article |
Language: | English |
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2024.3480990 |
Publisher: | Piscataway: IEEE |
Source: | IEEE Electronic Library (IEL) Journals |
Subjects: | Accuracy; Adaptation models; Adaptive sampling; Computer architecture; Data models; Dynamic sample allocation; few-shot learning; Hardware; hardware-aware; latency predictor; Network architecture; Network latency; neural architecture search (NAS); Optimization; Parameter identification; Performance evaluation; Platforms; Predictive models; representative sample sampling; Sample size; Search process; Training |
Online Access: | https://doi.org/10.1109/JIOT.2024.3480990 |
Abstract:
In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.
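The abstract describes fusing one-hot hardware encodings with architecture features via multihead attention. The sketch below illustrates that general idea only; it is not the authors' implementation, and the class name, dimensions, and layer choices are all assumptions.

```python
# Hypothetical sketch of an MHLP-style predictor: a one-hot hardware vector is
# embedded and used as an attention query over per-layer architecture features.
# All names and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class LatencyPredictor(nn.Module):
    def __init__(self, num_devices: int, arch_feat_dim: int, d_model: int = 64):
        super().__init__()
        self.hw_proj = nn.Linear(num_devices, d_model)      # one-hot hardware -> embedding
        self.arch_proj = nn.Linear(arch_feat_dim, d_model)  # per-layer features -> embedding
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Sequential(nn.Linear(d_model, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, hw_onehot, arch_feats):
        # hw_onehot: (B, num_devices); arch_feats: (B, L, arch_feat_dim)
        q = self.hw_proj(hw_onehot).unsqueeze(1)        # (B, 1, d_model): hardware as query
        kv = self.arch_proj(arch_feats)                 # (B, L, d_model): layers as keys/values
        fused, _ = self.attn(q, kv, kv)                 # hardware attends over architecture
        return self.head(fused.squeeze(1)).squeeze(-1)  # (B,) predicted latency

# Usage: 3 devices, 8-layer architectures with 5 features per layer.
model = LatencyPredictor(num_devices=3, arch_feat_dim=5)
hw = torch.nn.functional.one_hot(torch.tensor([0, 2]), num_classes=3).float()
arch = torch.randn(2, 8, 5)
print(model(hw, arch).shape)  # torch.Size([2])
```

Using the hardware embedding as the query lets a single predictor condition its latency estimate on the device, which is what allows one proxy model to generalize across platforms instead of being rebuilt per platform.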
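The abstract also mentions a two-stage sampling mechanism based on probability density weighting that balances representativeness and diversity. A minimal sketch of one plausible reading follows: stage one draws high-density (representative) candidates, stage two draws low-density (diverse) ones. The KDE estimator and the 50/50 budget split are assumptions, not details from the paper.

```python
# Hypothetical two-stage, density-weighted sampler. Stage 1 favors
# representative candidates (high estimated density); stage 2 adds diverse
# ones (low density). Bandwidth choice and budget split are assumptions.
import numpy as np
from scipy.stats import gaussian_kde

def two_stage_sample(features: np.ndarray, budget: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    density = gaussian_kde(features.T)(features.T)  # per-candidate density estimate

    n_rep = budget // 2
    p_rep = density / density.sum()                 # stage 1: weight by density
    rep = rng.choice(len(features), size=n_rep, replace=False, p=p_rep)

    remaining = np.setdiff1d(np.arange(len(features)), rep)
    inv = 1.0 / density[remaining]
    p_div = inv / inv.sum()                         # stage 2: weight by inverse density
    div = rng.choice(remaining, size=budget - n_rep, replace=False, p=p_div)
    return np.concatenate([rep, div])

# Usage: pick 10 of 200 candidate architectures, each described by 4 features.
feats = np.random.default_rng(1).normal(size=(200, 4))
print(two_stage_sample(feats, budget=10))
```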
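Finally, the dynamic sample allocation mechanism gives devices with larger initial deviations more measurement budget. The sketch below assumes a simple proportional-to-error rule; the actual allocation rule in the paper may differ.

```python
# Hypothetical dynamic allocation: devices where the initial predictor shows
# larger error receive proportionally more of the extra measurement budget.
# The proportional rule itself is an assumption about the mechanism.
import numpy as np

def allocate_samples(per_device_error: dict, extra_budget: int) -> dict:
    total = sum(per_device_error.values())
    raw = {d: extra_budget * e / total for d, e in per_device_error.items()}
    alloc = {d: int(v) for d, v in raw.items()}
    # Hand out rounding leftovers to the devices with the largest remainders.
    leftover = extra_budget - sum(alloc.values())
    for d in sorted(raw, key=lambda d: raw[d] - alloc[d], reverse=True)[:leftover]:
        alloc[d] += 1
    return alloc

# Usage: the device with the largest deviation gets the largest share.
print(allocate_samples({"cpu": 0.05, "gpu": 0.20, "npu": 0.10}, extra_budget=20))
```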
fullrecord | <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_journals_3159503450</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10720229</ieee_id><sourcerecordid>3159503450</sourcerecordid><originalsourceid>FETCH-LOGICAL-c176t-f94ad9b68710197c8afc1182d4a5c34eb9b3aaaaf74f1dc0b4bc0a046497e0763</originalsourceid><addsrcrecordid>eNpNkE1LAzEQhoMoWGp_gOBhwfPWycduNsdS1FaqFaznkM1OaErt1mxW6b83pR46MEwC7zMDDyG3FMaUgnp4mS9XYwZMjLmoQCm4IAPGmcxFWbLLs_c1GXXdBgASVlBVDsjstd9Gvzah-TUBs0lj9tH_YLYwEXf2kL0HbLyNvt1lrg3ZG_bBbLNJsGsf0cY-MR9o0veGXDmz7XD0P4fk8-lxNZ3li-XzfDpZ5JbKMuZOCdOouqwkBaqkrYyzlFasEaawXGCtam5SOSkcbSzUorZgQJRCSQRZ8iG5P-3dh_a7xy7qTduHXTqpOS1UAVykHhJ6StnQdl1Ap_fBf5lw0BT00Zk-OtNHZ_rfWWLuToxHxLO8TCmm-B8mXWgC</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3159503450</pqid></control><display><type>article</type><title>Multihardware Adaptive Latency Prediction for Neural Architecture Search</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Lin, Chengmin ; Yang, Pengfei ; Wang, Quan ; Guo, Yitong ; Wang, Zhenyi</creator><creatorcontrib>Lin, Chengmin ; Yang, Pengfei ; Wang, Quan ; Guo, Yitong ; Wang, Zhenyi</creatorcontrib><description>In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.</description><identifier>ISSN: 2327-4662</identifier><identifier>EISSN: 2327-4662</identifier><identifier>DOI: 10.1109/JIOT.2024.3480990</identifier><identifier>CODEN: IITJAU</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Accuracy ; Adaptation models ; Adaptive sampling ; Computer architecture ; Data models ; Dynamic sample allocation ; few-shot learning ; Hardware ; hardware-aware ; latency predictor ; Network architecture ; Network latency ; neural architecture search (NAS) ; Optimization ; Parameter identification ; Performance evaluation ; Platforms ; Predictive models ; representative sample sampling ; Sample size ; Search process ; Training</subject><ispartof>IEEE internet of things journal, 2025-02, Vol.12 (3), p.3385-3398</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2025</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c176t-f94ad9b68710197c8afc1182d4a5c34eb9b3aaaaf74f1dc0b4bc0a046497e0763</cites><orcidid>0000-0002-2138-3739 ; 0009-0007-3460-2850 ; 0000-0002-3121-6299 ; 0000-0001-6913-8604 ; 0000-0003-4065-4052</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10720229$$EHTML$$P50$$Gieee$$H</linktohtml></links><search><creatorcontrib>Lin, Chengmin</creatorcontrib><creatorcontrib>Yang, Pengfei</creatorcontrib><creatorcontrib>Wang, Quan</creatorcontrib><creatorcontrib>Guo, Yitong</creatorcontrib><creatorcontrib>Wang, Zhenyi</creatorcontrib><title>Multihardware Adaptive Latency Prediction for Neural Architecture Search</title><title>IEEE internet of things journal</title><addtitle>JIoT</addtitle><description>In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. 
Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.</description><subject>Accuracy</subject><subject>Adaptation models</subject><subject>Adaptive sampling</subject><subject>Computer architecture</subject><subject>Data models</subject><subject>Dynamic sample allocation</subject><subject>few-shot learning</subject><subject>Hardware</subject><subject>hardware-aware</subject><subject>latency predictor</subject><subject>Network architecture</subject><subject>Network latency</subject><subject>neural architecture search (NAS)</subject><subject>Optimization</subject><subject>Parameter identification</subject><subject>Performance evaluation</subject><subject>Platforms</subject><subject>Predictive models</subject><subject>representative sample sampling</subject><subject>Sample size</subject><subject>Search process</subject><subject>Training</subject><issn>2327-4662</issn><issn>2327-4662</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNpNkE1LAzEQhoMoWGp_gOBhwfPWycduNsdS1FaqFaznkM1OaErt1mxW6b83pR46MEwC7zMDDyG3FMaUgnp4mS9XYwZMjLmoQCm4IAPGmcxFWbLLs_c1GXXdBgASVlBVDsjstd9Gvzah-TUBs0lj9tH_YLYwEXf2kL0HbLyNvt1lrg3ZG_bBbLNJsGsf0cY-MR9o0veGXDmz7XD0P4fk8-lxNZ3li-XzfDpZ5JbKMuZOCdOouqwkBaqkrYyzlFasEaawXGCtam5SOSkcbSzUorZgQJRCSQRZ8iG5P-3dh_a7xy7qTduHXTqpOS1UAVykHhJ6StnQdl1Ap_fBf5lw0BT00Zk-OtNHZ_rfWWLuToxHxLO8TCmm-B8mXWgC</recordid><startdate>20250201</startdate><enddate>20250201</enddate><creator>Lin, Chengmin</creator><creator>Yang, Pengfei</creator><creator>Wang, Quan</creator><creator>Guo, Yitong</creator><creator>Wang, Zhenyi</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2138-3739</orcidid><orcidid>https://orcid.org/0009-0007-3460-2850</orcidid><orcidid>https://orcid.org/0000-0002-3121-6299</orcidid><orcidid>https://orcid.org/0000-0001-6913-8604</orcidid><orcidid>https://orcid.org/0000-0003-4065-4052</orcidid></search><sort><creationdate>20250201</creationdate><title>Multihardware Adaptive Latency Prediction for Neural Architecture Search</title><author>Lin, Chengmin ; Yang, Pengfei ; Wang, Quan ; Guo, Yitong ; Wang, Zhenyi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c176t-f94ad9b68710197c8afc1182d4a5c34eb9b3aaaaf74f1dc0b4bc0a046497e0763</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Accuracy</topic><topic>Adaptation models</topic><topic>Adaptive sampling</topic><topic>Computer architecture</topic><topic>Data models</topic><topic>Dynamic sample allocation</topic><topic>few-shot learning</topic><topic>Hardware</topic><topic>hardware-aware</topic><topic>latency predictor</topic><topic>Network architecture</topic><topic>Network latency</topic><topic>neural architecture search (NAS)</topic><topic>Optimization</topic><topic>Parameter identification</topic><topic>Performance evaluation</topic><topic>Platforms</topic><topic>Predictive models</topic><topic>representative sample sampling</topic><topic>Sample size</topic><topic>Search process</topic><topic>Training</topic><toplevel>online_resources</toplevel><creatorcontrib>Lin, Chengmin</creatorcontrib><creatorcontrib>Yang, Pengfei</creatorcontrib><creatorcontrib>Wang, Quan</creatorcontrib><creatorcontrib>Guo, Yitong</creatorcontrib><creatorcontrib>Wang, Zhenyi</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE/IET Electronic Library (IEL) (UW System Shared)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE internet of things journal</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lin, Chengmin</au><au>Yang, Pengfei</au><au>Wang, Quan</au><au>Guo, Yitong</au><au>Wang, Zhenyi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multihardware Adaptive Latency Prediction for Neural Architecture Search</atitle><jtitle>IEEE internet of things journal</jtitle><stitle>JIoT</stitle><date>2025-02-01</date><risdate>2025</risdate><volume>12</volume><issue>3</issue><spage>3385</spage><epage>3398</epage><pages>3385-3398</pages><issn>2327-4662</issn><eissn>2327-4662</eissn><coden>IITJAU</coden><abstract>In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. 
Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/JIOT.2024.3480990</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-2138-3739</orcidid><orcidid>https://orcid.org/0009-0007-3460-2850</orcidid><orcidid>https://orcid.org/0000-0002-3121-6299</orcidid><orcidid>https://orcid.org/0000-0001-6913-8604</orcidid><orcidid>https://orcid.org/0000-0003-4065-4052</orcidid></addata></record> |