
Explainable Reinforcement Learning (XRL)-Based Decap Placement Optimization for High Bandwidth Memory (HBM)

In this paper, for the first time, we propose an explainable reinforcement learning (XRL)-based decap placement optimization method for high bandwidth memory (HBM) considering power integrity (PI). The proposed XRL-based method enhances explainability by transforming the sum of various types of rewards into a vector sum operation for the trained model. A CNN-based network was used for training, with each reward considered from a multi-objective RL perspective. To verify the proposed method, we applied it to the problem of placing decaps in the VDDQ domain of an HBM3 module. The rewards were set as the suppression of self-impedance and transfer-impedance at each probing port. The proposed method achieved a 2.8% improvement compared to using a general scalar-sum reward. Ultimately, the vector differences in the Q-values for different actions provided grounds for the actions taken and allowed evaluation of whether the model was well trained.

Bibliographic Details
Main Authors: Kim, Keunwoo; Park, Hyunwook; Son, Keeyoung; Choi, Seonguk; Shin, Taein; Lee, Junghyun; Yoon, Jiwon; An, Hyunjun; Kim, Haeyeon; Choi, Wooshin; Choi, Jung-Hwan; Kim, Joungho
Format: Conference Proceeding
Language: English
Subjects: Bandwidth; Electronics packaging; explainable reinforcement learning (XRL); high-bandwidth memory (HBM); Optimization methods; Power integrity (PI); Reinforcement learning; Training; Vectors
Online Access: Request full text
cited_by
cites
container_end_page 3
container_issue
container_start_page 1
container_title
container_volume
creator Kim, Keunwoo
Park, Hyunwook
Son, Keeyoung
Choi, Seonguk
Shin, Taein
Lee, Junghyun
Yoon, Jiwon
An, Hyunjun
Kim, Haeyeon
Choi, Wooshin
Choi, Jung-Hwan
Kim, Joungho
description In this paper, for the first time, we propose an explainable reinforcement learning (XRL)-based decap placement optimization method for high bandwidth memory (HBM) considering power integrity (PI). The proposed XRL-based method enhances explainability by transforming the sum of various types of rewards into a vector sum operation for the trained model. A CNN-based network was used for training, with each reward considered from a multi-objective RL perspective. To verify the proposed method, we applied it to the problem of placing decaps in the VDDQ domain of an HBM3 module. The rewards were set as the suppression of self-impedance and transfer-impedance at each probing port. The proposed method achieved a 2.8% improvement compared to using a general scalar-sum reward. Ultimately, the vector differences in the Q-values for different actions provided grounds for the actions taken and allowed evaluation of whether the model was well trained.
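The description above is the only technical detail in this record, but the mechanism it names, a vector-valued Q-function whose components correspond to individual reward terms, summed for action selection and compared component-wise to explain a choice, can be sketched briefly. The PyTorch sketch below is a hypothetical illustration and not the authors' code: the placement grid size, layer widths, and the two-term reward split (self-impedance and transfer-impedance suppression) are assumptions made only for demonstration.

```python
# Minimal sketch (assumed, not from the paper) of a vector-valued Q idea:
# the network predicts one Q component per reward term instead of a single
# scalar Q from a summed reward. The scalar Q used to pick an action is the
# sum of the components; per-component differences between candidate actions
# serve as the "grounds" for the chosen decap placement.
import torch
import torch.nn as nn

N_REWARD_TERMS = 2           # [self-impedance, transfer-impedance] -- assumed split
GRID_H, GRID_W = 8, 8        # hypothetical decap placement grid on the VDDQ domain
N_ACTIONS = GRID_H * GRID_W  # one action per candidate decap location

class VectorQNet(nn.Module):
    """CNN mapping a placement state to a (N_ACTIONS, N_REWARD_TERMS) Q tensor."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Linear(32 * GRID_H * GRID_W, N_ACTIONS * N_REWARD_TERMS)

    def forward(self, state):
        q = self.head(self.features(state))
        return q.view(-1, N_ACTIONS, N_REWARD_TERMS)  # per-action Q vectors

def explain_action(q_vec, top_k=3):
    """Choose the action by the summed (scalar) Q, then report per-term
    Q-vector differences against the runner-up actions as the explanation."""
    q_scalar = q_vec.sum(dim=-1)                  # vector sum -> scalar Q
    best = int(q_scalar.argmax())
    runners_up = q_scalar.argsort(descending=True)[1:1 + top_k]
    margins = {int(a): (q_vec[best] - q_vec[a]).tolist() for a in runners_up}
    return best, margins  # margins[a][k]: how much reward term k favors best over a

# Toy usage: one random placement state (1 = decap already placed at that cell).
state = torch.rand(1, 1, GRID_H, GRID_W)
q_vec = VectorQNet()(state)[0]                    # (N_ACTIONS, N_REWARD_TERMS)
action, grounds = explain_action(q_vec)
print(f"chosen decap location: {action}, per-term margins vs. runners-up: {grounds}")
```

Under these assumptions, the per-term margins play the explanatory role the abstract attributes to the vector differences in Q-values: they show which reward component (self- or transfer-impedance suppression) drove the preference for the selected decap location.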
doi_str_mv 10.1109/EPEPS61853.2024.10754045
format conference_proceeding
fulltext fulltext_linktorsrc
identifier EISSN: 2165-4115
ispartof IEEE ... Conference on Electrical Performance of Electronic Packaging and Systems (Print), 2024, p.1-3
issn 2165-4115
language eng
recordid cdi_ieee_primary_10754045
source IEEE Xplore All Conference Series
subjects Bandwidth
Electronics packaging
explainable reinforcement learning (XRL)
high-bandwidth memory (HBM)
Optimization methods
Power integrity (PI)
Reinforcement learning
Training
Vectors
title Explainable Reinforcement Learning (XRL)-Based Decap Placement Optimization for High Bandwidth Memory (HBM)