
Transformer Network-Based Reinforcement Learning Method for Power Distribution Network (PDN) Optimization of High Bandwidth Memory (HBM)

Bibliographic Details
Published in:	IEEE transactions on microwave theory and techniques 2022-11, Vol.70 (11), p.4772-4786
Main Authors:	Park, Hyunwook; Kim, Minsu; Kim, Seongguk; Kim, Keunwoo; Kim, Haeyeon; Shin, Taein; Son, Keeyoung; Sim, Boogyo; Kim, Subin; Jeong, Seungtaek; Hwang, Chulsoon; Kim, Joungho
Format:	Article
Language:	English
Subjects:	Bandwidth; Combinatorial analysis; Computational efficiency; Computing time; Context; Decoupling; Decoupling capacitor (decap); Electric power distribution; Embedding; Genetic algorithms; high bandwidth memory (HBM); Impedance; Learning; Optimization; power distribution network (PDN); reinforcement learning (RL); Scalability; Seaports; Training; transformer network; Transformers

Description
Summary:	In this article, we propose, for the first time, a transformer network-based reinforcement learning (RL) method for power distribution network (PDN) optimization of high bandwidth memory (HBM). The proposed method can provide an optimal decoupling capacitor (decap) design that maximizes the reduction of PDN self- and transfer impedances seen at multiple ports. An attention-based transformer network is implemented to directly parameterize the decap optimization policy. Optimality is significantly improved because the attention mechanism is expressive enough to explore the massive combinatorial space of decap assignments and can capture sequential relationships between them. The computing time for optimization is dramatically reduced because the trained network can be reused across different positions of probing ports and decap assignment candidates: the transformer network has a context embedding process that captures meta-features, including probing port positions. In addition, the network is trained with randomly generated datasets. The computing time for training and the data cost are substantially decreased owing to the scalability of the network: thanks to its shared-weight property and the context embedding process, the network can adapt to larger-scale problems without additional training. For verification, the results are compared with a conventional genetic algorithm (GA), random search (RS), and all previous RL-based methods. The proposed method outperforms them in all of the following aspects: optimality performance, computing time, and data efficiency.
ISSN:	0018-9480 1557-9670
DOI:	10.1109/TMTT.2022.3202221