LIBRA: Clearing the Cloud Through Dynamic Memory Bandwidth Management
Main Authors: | , , , , , , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Summary: | Modern Cloud Service Providers (CSPs) heavily co-schedule tasks with different priorities on the same computing node to increase server utilization. To ensure the performance of high-priority jobs, CSPs usually employ Quality-of-Service (QoS) mechanisms to manage or regulate the usage of shared hardware resources. Among the critical shared hardware resources, there has been very limited analysis of effective memory bandwidth sharing among co-scheduled jobs, mainly for two reasons: (1) the correlation between an application's performance and its memory bandwidth allocation is complicated, and (2) an effective hardware throttling mechanism for precise memory bandwidth control is unavailable. These limitations drive CSPs to design conservative policies to ensure the performance of high-priority tasks, which significantly degrades the throughput of batch jobs and reduces the overall benefits of workload co-scheduling. This paper proposes LIBRA, a holistic framework for dynamic memory bandwidth management in production data centers. LIBRA incorporates a novel hardware throttling mechanism, Dynamic Resource Control, to support self-adaptive memory bandwidth regulation. It also employs a lightweight control policy to further enhance the bandwidth scalability of the throttled tasks. Evaluation results on a cluster demonstrate that LIBRA can increase the performance of batch jobs by up to 52.8% compared to existing QoS schemes. |
---|---|
ISSN: | 2378-203X |
DOI: | 10.1109/HPCA51647.2021.00073 |