Control of Fab Lifters via Deep Reinforcement Learning: A Semi-MDP Approach
Published in: IEEE Transactions on Automation Science and Engineering, 2024-10, Vol. 21 (4), pp. 5136-5148
Main Authors: , , , ,
Format: Article
Language: English
Summary:

In multi-floor fabrication facilities (fabs), the efficient transportation of resources by lifters is crucial to the overall productivity of the modern semiconductor industry. Unfortunately, for most existing fab-lifter control methods, exact system models are difficult to obtain and traditional numerical schemes are hard to apply. In this paper, we propose two off-policy deep reinforcement learning algorithms that learn to control lifters efficiently in complex multi-floor fab environments with limited prior knowledge. The proposed algorithms exploit a novel semi-Markov decision process (semi-MDP) model and resolve several challenges arising from the complex structure of multi-floor fabs to achieve highly efficient transportation. Extensive empirical analyses confirm that controllers trained with our method automatically learn the intricate structure of the problem and react effectively to the real-time information flow of the multi-floor fab without recourse to any domain knowledge. We show empirically that the proposed methods outperform advanced techniques, including model predictive control and proximal policy optimization, in both computation time and transportation efficiency across various simulated fab environments.

Note to Practitioners: Our method is readily integrated into any fab that tracks real-time operational data, a common capability in modern fabs, and it is particularly well suited to facilities whose manufacturing processes are consistent over time. A crucial property of the lifter control problem is that the arrival of lots is independent of the lifter's behavior; this makes it possible to collect data from real-world fabs and build multiple scenarios for training the reinforcement learning controllers (see the illustrative sketch after this record). Furthermore, the method's robustness to small perturbations in the scenarios keeps it applicable even when lot-arrival frequencies change slightly. However, if lot-arrival frequencies shift significantly from those in the training data, the method may not perform as expected and may need to be extended with an advanced technique such as meta-reinforcement learning.
ISSN: 1545-5955, 1558-3783
DOI: 10.1109/TASE.2023.3308849
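The semi-MDP model the summary credits is the key modeling choice: a lifter dispatch decision runs for a variable amount of time (travel plus load/unload), so the value update must discount by the elapsed sojourn time rather than by a single step. Below is a minimal, hypothetical sketch of that idea, not the paper's algorithm: it uses tabular semi-MDP Q-learning in place of the paper's deep off-policy learners, and every name in it (FLOORS, choose_floor, train_on_scenario, logged_scenario) is an assumption made for illustration.

```python
# Illustrative sketch only -- not the paper's implementation. It shows
# the semi-MDP update the abstract relies on: a dispatch decision runs
# for a variable sojourn time tau, so the bootstrap target is
# discounted by GAMMA ** tau instead of a single factor of GAMMA.
import random
from collections import defaultdict

GAMMA = 0.99   # per-unit-time discount factor (assumed)
ALPHA = 0.1    # learning rate (assumed)
FLOORS = 5     # hypothetical number of fab floors

# Tabular Q-values; a state might encode the lifter's current floor
# and which floors have waiting lots. (The paper uses deep networks.)
Q = defaultdict(lambda: [0.0] * FLOORS)

def choose_floor(state, eps=0.1):
    """Epsilon-greedy choice of which floor the lifter serves next."""
    if random.random() < eps:
        return random.randrange(FLOORS)
    return max(range(FLOORS), key=lambda a: Q[state][a])

def semi_mdp_update(state, action, reward, tau, next_state):
    """One semi-MDP Q-learning step.

    reward: discounted return accumulated while the decision ran,
            i.e. sum(GAMMA**t * r_t for t in range(tau)).
    tau:    sojourn time of the decision (travel + load/unload);
            the GAMMA ** tau term is what distinguishes this update
            from the ordinary one-step MDP case.
    """
    target = reward + (GAMMA ** tau) * max(Q[next_state])
    Q[state][action] += ALPHA * (target - Q[state][action])

def train_on_scenario(logged_scenario):
    """Replay a logged transition trace as a fixed training scenario.

    Replaying recorded arrivals is sound here precisely because, as
    the Note to Practitioners observes, lot arrivals are independent
    of the lifter's behavior.
    """
    for state, action, reward, tau, next_state in logged_scenario:
        semi_mdp_update(state, action, reward, tau, next_state)
```

Because the update is off-policy (the max over next actions does not depend on which action the logged trace actually took), transitions gathered under any dispatch rule can be used for training, which is consistent with the summary's emphasis on off-policy learning from collected fab data.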