Energy-Efficient GPU-Intensive Workload Scheduling for Data Centers

Bibliographic Details
Main Authors: Smith, Matthew, Zhao, Luke, Cordova, Jonathan, Jiang, Xunfei, Ebrahimi, Mahdi
Format: Conference Proceeding
Language:English
Description
Summary: Cooling costs account for a significant part of the total energy consumption in data centers, and previous researchers mainly focused on investigating thermal-aware workload distribution strategies for CPU-intensive workloads. This paper introduces a novel machine learning-based approach that aims at reducing energy consumption through thermal-aware workload distribution to build energy-efficient data centers for GPU-intensive workloads. To achieve this goal, the study employs the GpuCloudSim Plus simulator, which effectively models the distribution of GPU-intensive applications under diverse workloads and utilizations. The integration of machine learning models allows for accurate temperature predictions and comprehensive evaluation of the proposed algorithm's performance. We propose a new workload scheduling algorithm, ThermalAwareGpu, to reduce the energy cost of GPU-intensive workloads. We evaluated our algorithm by generating three common workload patterns, and it saved up to 12.79% of computing cost compared to the baseline algorithms. Our future work includes exploring the estimation of data center cooling energy and conducting in-depth comparisons of different workload balancing algorithms on various compute-intensive workloads.
ISSN:1946-0759
DOI:10.1109/ICMLA58977.2023.00263