Energy-Efficient GPU-Intensive Workload Scheduling for Data Centers

Cooling costs account for a significant part of the total energy consumption in data centers, and previous researchers mainly focused on investigating thermal-aware workload distribution strategies for CPU-intensive workloads. This paper introduces a novel machine learning-based approach that aims at...

Bibliographic Details
Main Authors: Smith, Matthew; Zhao, Luke; Cordova, Jonathan; Jiang, Xunfei; Ebrahimi, Mahdi
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 1740
container_issue
container_start_page 1735
container_title
container_volume
creator Smith, Matthew
Zhao, Luke
Cordova, Jonathan
Jiang, Xunfei
Ebrahimi, Mahdi
description Cooling costs account for a significant part of the total energy consumption in data centers, and previous researchers mainly focused on investigating thermal-aware workload distribution strategies for CPU-intensive workloads. This paper introduces a novel machine learning-based approach that aims to reduce energy consumption through thermal-aware workload distribution, in order to build energy-efficient data centers for GPU-intensive workloads. To achieve this goal, the study employs the GpuCloudSim Plus simulator, which models the distribution of GPU-intensive applications under diverse workloads and utilizations. The integration of machine learning models allows for accurate temperature predictions and comprehensive evaluation of the proposed algorithm's performance. We propose a new workload scheduling algorithm, ThermalAwareGpu, to reduce the energy cost of GPU-intensive workloads. We evaluated our algorithm by generating three common workload patterns, and it saved up to 12.79% of computing cost compared to the baseline algorithms. Our future work includes exploring the estimation of data center cooling energy and conducting in-depth comparisons of different workload balancing algorithms on various compute-intensive workloads.
doi_str_mv 10.1109/ICMLA58977.2023.00263
format conference_proceeding
fulltext fulltext_linktorsrc
identifier EISSN: 1946-0759
ispartof 2023 International Conference on Machine Learning and Applications (ICMLA), 2023, p.1735-1740
issn 1946-0759
language eng
recordid cdi_ieee_primary_10459846
source IEEE Xplore All Conference Series
subjects Cooling
Costs
Data centers
Energy consumption
energy-efficient
GPU
Machine learning
Machine learning algorithms
Scheduling algorithms
workload management
title Energy-Efficient GPU-Intensive Workload Scheduling for Data Centers
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T08%3A25%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Energy-Efficient%20GPU-Intensive%20Workload%20Scheduling%20for%20Data%20Centers&rft.btitle=2023%20International%20Conference%20on%20Machine%20Learning%20and%20Applications%20(ICMLA)&rft.au=Smith,%20Matthew&rft.date=2023-12-15&rft.spage=1735&rft.epage=1740&rft.pages=1735-1740&rft.eissn=1946-0759&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ICMLA58977.2023.00263&rft.eisbn=9798350345346&rft_dat=%3Cieee_CHZPO%3E10459846%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i119t-4e33e92f20f3ff95230d275d60ab0603b13aff6b0ab35f6dbfb3e1956b2ae9df3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10459846&rfr_iscdi=true