
Improving Learning-Based DAG Scheduling by Inserting Deliberate Idle Slots

Bibliographic Details
Published in: IEEE Network, 2021-11, Vol. 35 (6), p. 133-139
Main Authors: Duan, Yubin; Wu, Jie
Format: Article
Language: English
Description
Summary: The increasing demand for computing capability makes it expensive to operate a large-scale cloud cluster. A good scheduling algorithm should reduce the average job completion time (JCT), which is the time between a job's arrival and its termination. However, designing a scheduler that minimizes the average JCT is challenging when the stages of each job are subject to precedence constraints and jobs arrive online. Counterintuitively, we find that inserting idle time before some jobs might reduce the JCT, a possibility that many schedulers ignore. The state-of-the-art scheduler, which uses reinforcement learning (RL) techniques to solve scheduling problems, does not consider deliberate idle time. We integrate our observations into the RL agent and let the agent learn the best length of idle time. We also carefully design the features used in RL: the shape of each job DAG is captured by its critical path length and average width, and the detailed precedence constraints in each job DAG are extracted by graph neural networks. Experimental results on both synthetic and real-world datasets show that inserting deliberate idle time could reduce the average JCT. The results also illustrate the significant contribution made by the proposed features.
ISSN: 0890-8044, 1558-156X
DOI: 10.1109/MNET.001.2100231
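
The summary's claim that deliberate idle time might reduce the average JCT can be seen in a toy case. The sketch below is not the authors' RL scheduler: it assumes a single machine, non-preemptive execution, and two hypothetical jobs (a long one arriving at time 0 and a short one arriving at time 1). The function name `average_jct` and the job values are illustrative only.

```python
from typing import List, Tuple

# A job is (arrival_time, duration). Values below are made up for illustration.
Job = Tuple[float, float]


def average_jct(jobs: List[Job], order: List[int]) -> float:
    """Run jobs one at a time, without preemption, in the given order.

    A job never starts before it arrives, so an order that puts a
    later-arriving job first forces the machine to sit idle until then.
    JCT of a job = finish time - arrival time.
    """
    clock = 0.0
    total_jct = 0.0
    for idx in order:
        arrival, duration = jobs[idx]
        start = max(clock, arrival)  # wait (idle) if the job has not arrived
        clock = start + duration
        total_jct += clock - arrival
    return total_jct / len(order)


jobs = [(0.0, 10.0), (1.0, 1.0)]  # long job at t=0, short job at t=1

# Work-conserving: start the long job immediately; the short job waits behind it.
greedy = average_jct(jobs, order=[0, 1])

# Deliberate idle slot: leave the machine idle during [0, 1), run the short job first.
with_idle = average_jct(jobs, order=[1, 0])

print(greedy, with_idle)
```

In this toy case the work-conserving schedule blocks the short job behind the long one, while idling for one time unit lets the short job finish almost immediately at a small cost to the long job. How long to idle in general is the quantity the summary says the RL agent learns.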