
Explainable Reinforcement Learning (XRL)-Based Decap Placement Optimization for High Bandwidth Memory (HBM)

In this paper, for the first time, we propose an explainable reinforcement learning (XRL)-based decap placement optimization method for high bandwidth memory (HBM) considering power integrity (PI). The proposed XRL-based method enhances explainability by transforming the sum of various types of rewards into a vector sum operation for the trained model. A CNN-based network was used for training, with each reward considered from a multi-objective RL perspective. To verify the proposed method, we applied it to the problem of placing decaps in the VDDQ domain of an HBM3 module. The rewards were set as the suppression of self-impedance and transfer-impedance at each probing port. The proposed method achieved a 2.8% improvement compared to using a general scalar-sum reward. Ultimately, the vector differences in the Q-values for different actions provided grounds for the actions taken and allowed evaluation of whether the model was well trained.

Bibliographic Details
Main Authors: Kim, Keunwoo; Park, Hyunwook; Son, Keeyoung; Choi, Seonguk; Shin, Taein; Lee, Junghyun; Yoon, Jiwon; An, Hyunjun; Kim, Haeyeon; Choi, Wooshin; Choi, Jung-Hwan; Kim, Joungho
Format: Conference Proceeding
Language: English
Subjects: Bandwidth; Electronics packaging; explainable reinforcement learning (XRL); high-bandwidth memory (HBM); Optimization methods; Power integrity (PI); Reinforcement learning; Training; Vectors
Online Access: Request full text
cited_by
cites
container_end_page 3
container_issue
container_start_page 1
container_title
container_volume
creator Kim, Keunwoo
Park, Hyunwook
Son, Keeyoung
Choi, Seonguk
Shin, Taein
Lee, Junghyun
Yoon, Jiwon
An, Hyunjun
Kim, Haeyeon
Choi, Wooshin
Choi, Jung-Hwan
Kim, Joungho
description In this paper, for the first time, we propose an explainable reinforcement learning (XRL)-based decap placement optimization method for high bandwidth memory (HBM) considering power integrity (PI). The proposed XRL-based method enhances explainability by transforming the sum of various types of rewards into a vector sum operation for the trained model. A CNN-based network was used for training, with each reward considered from a multi-objective RL perspective. To verify the proposed method, we applied it to the problem of placing decaps in the VDDQ domain of an HBM3 module. The rewards were set as the suppression of self-impedance and transfer-impedance at each probing port. The proposed method achieved a 2.8% improvement compared to using a general scalar-sum reward. Ultimately, the vector differences in the Q-values for different actions provided grounds for the actions taken and allowed evaluation of whether the model was well trained.
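The description above is the only technical detail in this record, but the mechanism it names, a vector-valued Q-function whose components correspond to individual reward terms, summed for action selection and compared component-wise to explain a choice, can be sketched briefly. The PyTorch sketch below is a hypothetical illustration and not the authors' code: the placement grid size, layer widths, and the two-term reward split (self-impedance and transfer-impedance suppression) are assumptions made only for demonstration.

```python
# Minimal sketch (assumed, not from the paper) of a vector-valued Q idea:
# the network predicts one Q component per reward term instead of a single
# scalar Q from a summed reward. The scalar Q used to pick an action is the
# sum of the components; per-component differences between candidate actions
# serve as the "grounds" for the chosen decap placement.
import torch
import torch.nn as nn

N_REWARD_TERMS = 2           # [self-impedance, transfer-impedance] -- assumed split
GRID_H, GRID_W = 8, 8        # hypothetical decap placement grid on the VDDQ domain
N_ACTIONS = GRID_H * GRID_W  # one action per candidate decap location

class VectorQNet(nn.Module):
    """CNN mapping a placement state to a (N_ACTIONS, N_REWARD_TERMS) Q tensor."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Linear(32 * GRID_H * GRID_W, N_ACTIONS * N_REWARD_TERMS)

    def forward(self, state):
        q = self.head(self.features(state))
        return q.view(-1, N_ACTIONS, N_REWARD_TERMS)  # per-action Q vectors

def explain_action(q_vec, top_k=3):
    """Choose the action by the summed (scalar) Q, then report per-term
    Q-vector differences against the runner-up actions as the explanation."""
    q_scalar = q_vec.sum(dim=-1)                  # vector sum -> scalar Q
    best = int(q_scalar.argmax())
    runners_up = q_scalar.argsort(descending=True)[1:1 + top_k]
    margins = {int(a): (q_vec[best] - q_vec[a]).tolist() for a in runners_up}
    return best, margins  # margins[a][k]: how much reward term k favors best over a

# Toy usage: one random placement state (1 = decap already placed at that cell).
state = torch.rand(1, 1, GRID_H, GRID_W)
q_vec = VectorQNet()(state)[0]                    # (N_ACTIONS, N_REWARD_TERMS)
action, grounds = explain_action(q_vec)
print(f"chosen decap location: {action}, per-term margins vs. runners-up: {grounds}")
```

Under these assumptions, the per-term margins play the explanatory role the abstract attributes to the vector differences in Q-values: they show which reward component (self- or transfer-impedance suppression) drove the preference for the selected decap location.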
doi_str_mv 10.1109/EPEPS61853.2024.10754045
format conference_proceeding
fulltext fulltext_linktorsrc
identifier EISSN: 2165-4115
ispartof IEEE ... Conference on Electrical Performance of Electronic Packaging and Systems (Print), 2024, p.1-3
issn 2165-4115
language eng
recordid cdi_ieee_primary_10754045
source IEEE Xplore All Conference Series
subjects Bandwidth
Electronics packaging
explainable reinforcement learning (XRL)
high-bandwidth memory (HBM)
Optimization methods
Power integrity (PI)
Reinforcement learning
Training
Vectors
title Explainable Reinforcement Learning (XRL)-Based Decap Placement Optimization for High Bandwidth Memory (HBM)