Multihardware Adaptive Latency Prediction for Neural Architecture Search
Published in: | IEEE Internet of Things Journal, 2025-02, Vol. 12 (3), p. 3385-3398 |
---|---|
Main Authors: | Lin, Chengmin; Yang, Pengfei; Wang, Quan; Guo, Yitong; Wang, Zhenyi |
Format: | Article |
Language: | English |
ISSN: | 2327-4662 |
DOI: | 10.1109/JIOT.2024.3480990 |
Publisher: | Piscataway: IEEE |
Source: | IEEE Electronic Library (IEL) Journals |
Subjects: | Accuracy; Adaptation models; Adaptive sampling; Computer architecture; Data models; Dynamic sample allocation; few-shot learning; Hardware; hardware-aware; latency predictor; Network architecture; Network latency; neural architecture search (NAS); Optimization; Parameter identification; Performance evaluation; Platforms; Predictive models; representative sample sampling; Sample size; Search process; Training |
Online Access: | https://doi.org/10.1109/JIOT.2024.3480990 |
Abstract:
In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.
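The abstract describes fusing one-hot hardware encodings with architecture features via multihead attention. The sketch below illustrates that general idea only; it is not the authors' implementation, and the class name, dimensions, and layer choices are all assumptions.

```python
# Hypothetical sketch of an MHLP-style predictor: a one-hot hardware vector is
# embedded and used as an attention query over per-layer architecture features.
# All names and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class LatencyPredictor(nn.Module):
    def __init__(self, num_devices: int, arch_feat_dim: int, d_model: int = 64):
        super().__init__()
        self.hw_proj = nn.Linear(num_devices, d_model)      # one-hot hardware -> embedding
        self.arch_proj = nn.Linear(arch_feat_dim, d_model)  # per-layer features -> embedding
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Sequential(nn.Linear(d_model, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, hw_onehot, arch_feats):
        # hw_onehot: (B, num_devices); arch_feats: (B, L, arch_feat_dim)
        q = self.hw_proj(hw_onehot).unsqueeze(1)        # (B, 1, d_model): hardware as query
        kv = self.arch_proj(arch_feats)                 # (B, L, d_model): layers as keys/values
        fused, _ = self.attn(q, kv, kv)                 # hardware attends over architecture
        return self.head(fused.squeeze(1)).squeeze(-1)  # (B,) predicted latency

# Usage: 3 devices, 8-layer architectures with 5 features per layer.
model = LatencyPredictor(num_devices=3, arch_feat_dim=5)
hw = torch.nn.functional.one_hot(torch.tensor([0, 2]), num_classes=3).float()
arch = torch.randn(2, 8, 5)
print(model(hw, arch).shape)  # torch.Size([2])
```

Using the hardware embedding as the query lets a single predictor condition its latency estimate on the device, which is what allows one proxy model to generalize across platforms instead of being rebuilt per platform.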
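The abstract also mentions a two-stage sampling mechanism based on probability density weighting that balances representativeness and diversity. A minimal sketch of one plausible reading follows: stage one draws high-density (representative) candidates, stage two draws low-density (diverse) ones. The KDE estimator and the 50/50 budget split are assumptions, not details from the paper.

```python
# Hypothetical two-stage, density-weighted sampler. Stage 1 favors
# representative candidates (high estimated density); stage 2 adds diverse
# ones (low density). Bandwidth choice and budget split are assumptions.
import numpy as np
from scipy.stats import gaussian_kde

def two_stage_sample(features: np.ndarray, budget: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    density = gaussian_kde(features.T)(features.T)  # per-candidate density estimate

    n_rep = budget // 2
    p_rep = density / density.sum()                 # stage 1: weight by density
    rep = rng.choice(len(features), size=n_rep, replace=False, p=p_rep)

    remaining = np.setdiff1d(np.arange(len(features)), rep)
    inv = 1.0 / density[remaining]
    p_div = inv / inv.sum()                         # stage 2: weight by inverse density
    div = rng.choice(remaining, size=budget - n_rep, replace=False, p=p_div)
    return np.concatenate([rep, div])

# Usage: pick 10 of 200 candidate architectures, each described by 4 features.
feats = np.random.default_rng(1).normal(size=(200, 4))
print(two_stage_sample(feats, budget=10))
```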
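Finally, the dynamic sample allocation mechanism gives devices with larger initial deviations more measurement budget. The sketch below assumes a simple proportional-to-error rule; the actual allocation rule in the paper may differ.

```python
# Hypothetical dynamic allocation: devices where the initial predictor shows
# larger error receive proportionally more of the extra measurement budget.
# The proportional rule itself is an assumption about the mechanism.
import numpy as np

def allocate_samples(per_device_error: dict, extra_budget: int) -> dict:
    total = sum(per_device_error.values())
    raw = {d: extra_budget * e / total for d, e in per_device_error.items()}
    alloc = {d: int(v) for d, v in raw.items()}
    # Hand out rounding leftovers to the devices with the largest remainders.
    leftover = extra_budget - sum(alloc.values())
    for d in sorted(raw, key=lambda d: raw[d] - alloc[d], reverse=True)[:leftover]:
        alloc[d] += 1
    return alloc

# Usage: the device with the largest deviation gets the largest share.
print(allocate_samples({"cpu": 0.05, "gpu": 0.20, "npu": 0.10}, extra_budget=20))
```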
fullrecord | <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_journals_3159503450</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10720229</ieee_id><sourcerecordid>3159503450</sourcerecordid><originalsourceid>FETCH-LOGICAL-c176t-f94ad9b68710197c8afc1182d4a5c34eb9b3aaaaf74f1dc0b4bc0a046497e0763</originalsourceid><addsrcrecordid>eNpNkE1LAzEQhoMoWGp_gOBhwfPWycduNsdS1FaqFaznkM1OaErt1mxW6b83pR46MEwC7zMDDyG3FMaUgnp4mS9XYwZMjLmoQCm4IAPGmcxFWbLLs_c1GXXdBgASVlBVDsjstd9Gvzah-TUBs0lj9tH_YLYwEXf2kL0HbLyNvt1lrg3ZG_bBbLNJsGsf0cY-MR9o0veGXDmz7XD0P4fk8-lxNZ3li-XzfDpZ5JbKMuZOCdOouqwkBaqkrYyzlFasEaawXGCtam5SOSkcbSzUorZgQJRCSQRZ8iG5P-3dh_a7xy7qTduHXTqpOS1UAVykHhJ6StnQdl1Ap_fBf5lw0BT00Zk-OtNHZ_rfWWLuToxHxLO8TCmm-B8mXWgC</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3159503450</pqid></control><display><type>article</type><title>Multihardware Adaptive Latency Prediction for Neural Architecture Search</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Lin, Chengmin ; Yang, Pengfei ; Wang, Quan ; Guo, Yitong ; Wang, Zhenyi</creator><creatorcontrib>Lin, Chengmin ; Yang, Pengfei ; Wang, Quan ; Guo, Yitong ; Wang, Zhenyi</creatorcontrib><description>In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.</description><identifier>ISSN: 2327-4662</identifier><identifier>EISSN: 2327-4662</identifier><identifier>DOI: 10.1109/JIOT.2024.3480990</identifier><identifier>CODEN: IITJAU</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Accuracy ; Adaptation models ; Adaptive sampling ; Computer architecture ; Data models ; Dynamic sample allocation ; few-shot learning ; Hardware ; hardware-aware ; latency predictor ; Network architecture ; Network latency ; neural architecture search (NAS) ; Optimization ; Parameter identification ; Performance evaluation ; Platforms ; Predictive models ; representative sample sampling ; Sample size ; Search process ; Training</subject><ispartof>IEEE internet of things journal, 2025-02, Vol.12 (3), p.3385-3398</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2025</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c176t-f94ad9b68710197c8afc1182d4a5c34eb9b3aaaaf74f1dc0b4bc0a046497e0763</cites><orcidid>0000-0002-2138-3739 ; 0009-0007-3460-2850 ; 0000-0002-3121-6299 ; 0000-0001-6913-8604 ; 0000-0003-4065-4052</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10720229$$EHTML$$P50$$Gieee$$H</linktohtml></links><search><creatorcontrib>Lin, Chengmin</creatorcontrib><creatorcontrib>Yang, Pengfei</creatorcontrib><creatorcontrib>Wang, Quan</creatorcontrib><creatorcontrib>Guo, Yitong</creatorcontrib><creatorcontrib>Wang, Zhenyi</creatorcontrib><title>Multihardware Adaptive Latency Prediction for Neural Architecture Search</title><title>IEEE internet of things journal</title><addtitle>JIoT</addtitle><description>In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. 
Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.</description><subject>Accuracy</subject><subject>Adaptation models</subject><subject>Adaptive sampling</subject><subject>Computer architecture</subject><subject>Data models</subject><subject>Dynamic sample allocation</subject><subject>few-shot learning</subject><subject>Hardware</subject><subject>hardware-aware</subject><subject>latency predictor</subject><subject>Network architecture</subject><subject>Network latency</subject><subject>neural architecture search (NAS)</subject><subject>Optimization</subject><subject>Parameter identification</subject><subject>Performance evaluation</subject><subject>Platforms</subject><subject>Predictive models</subject><subject>representative sample sampling</subject><subject>Sample size</subject><subject>Search process</subject><subject>Training</subject><issn>2327-4662</issn><issn>2327-4662</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNpNkE1LAzEQhoMoWGp_gOBhwfPWycduNsdS1FaqFaznkM1OaErt1mxW6b83pR46MEwC7zMDDyG3FMaUgnp4mS9XYwZMjLmoQCm4IAPGmcxFWbLLs_c1GXXdBgASVlBVDsjstd9Gvzah-TUBs0lj9tH_YLYwEXf2kL0HbLyNvt1lrg3ZG_bBbLNJsGsf0cY-MR9o0veGXDmz7XD0P4fk8-lxNZ3li-XzfDpZ5JbKMuZOCdOouqwkBaqkrYyzlFasEaawXGCtam5SOSkcbSzUorZgQJRCSQRZ8iG5P-3dh_a7xy7qTduHXTqpOS1UAVykHhJ6StnQdl1Ap_fBf5lw0BT00Zk-OtNHZ_rfWWLuToxHxLO8TCmm-B8mXWgC</recordid><startdate>20250201</startdate><enddate>20250201</enddate><creator>Lin, Chengmin</creator><creator>Yang, Pengfei</creator><creator>Wang, Quan</creator><creator>Guo, Yitong</creator><creator>Wang, Zhenyi</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2138-3739</orcidid><orcidid>https://orcid.org/0009-0007-3460-2850</orcidid><orcidid>https://orcid.org/0000-0002-3121-6299</orcidid><orcidid>https://orcid.org/0000-0001-6913-8604</orcidid><orcidid>https://orcid.org/0000-0003-4065-4052</orcidid></search><sort><creationdate>20250201</creationdate><title>Multihardware Adaptive Latency Prediction for Neural Architecture Search</title><author>Lin, Chengmin ; Yang, Pengfei ; Wang, Quan ; Guo, Yitong ; Wang, Zhenyi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c176t-f94ad9b68710197c8afc1182d4a5c34eb9b3aaaaf74f1dc0b4bc0a046497e0763</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Accuracy</topic><topic>Adaptation models</topic><topic>Adaptive sampling</topic><topic>Computer architecture</topic><topic>Data models</topic><topic>Dynamic sample allocation</topic><topic>few-shot learning</topic><topic>Hardware</topic><topic>hardware-aware</topic><topic>latency predictor</topic><topic>Network architecture</topic><topic>Network latency</topic><topic>neural architecture search (NAS)</topic><topic>Optimization</topic><topic>Parameter identification</topic><topic>Performance evaluation</topic><topic>Platforms</topic><topic>Predictive models</topic><topic>representative sample sampling</topic><topic>Sample size</topic><topic>Search process</topic><topic>Training</topic><toplevel>online_resources</toplevel><creatorcontrib>Lin, Chengmin</creatorcontrib><creatorcontrib>Yang, Pengfei</creatorcontrib><creatorcontrib>Wang, Quan</creatorcontrib><creatorcontrib>Guo, Yitong</creatorcontrib><creatorcontrib>Wang, Zhenyi</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE/IET Electronic Library (IEL) (UW System Shared)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE internet of things journal</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lin, Chengmin</au><au>Yang, Pengfei</au><au>Wang, Quan</au><au>Guo, Yitong</au><au>Wang, Zhenyi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multihardware Adaptive Latency Prediction for Neural Architecture Search</atitle><jtitle>IEEE internet of things journal</jtitle><stitle>JIoT</stitle><date>2025-02-01</date><risdate>2025</risdate><volume>12</volume><issue>3</issue><spage>3385</spage><epage>3398</epage><pages>3385-3398</pages><issn>2327-4662</issn><eissn>2327-4662</eissn><coden>IITJAU</coden><abstract>In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for search optimization. 
Traditional approaches, which measure numerous samples to train proxy models, are impractical across varied platforms due to the extensive resources needed to remeasure and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that enhances the generalizability of proxy models across different platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that leverages one-hot encoding for hardware parameters and multihead attention mechanisms to effectively capture the intricate interplay between hardware attributes and network architecture features. Additionally, we implement a two-stage sampling mechanism based on probability density weighting to ensure the representativeness and diversity of the sample set. By adopting a dynamic sample allocation mechanism, our method can adjust the adaptive sample size according to the initial model state, providing stronger data support for devices with significant deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/JIOT.2024.3480990</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0002-2138-3739</orcidid><orcidid>https://orcid.org/0009-0007-3460-2850</orcidid><orcidid>https://orcid.org/0000-0002-3121-6299</orcidid><orcidid>https://orcid.org/0000-0001-6913-8604</orcidid><orcidid>https://orcid.org/0000-0003-4065-4052</orcidid></addata></record> |