
VM Scaling and Load Balancing via Cost Optimal MDP Solution

Bibliographic Details
Published in:IEEE transactions on cloud computing 2022-07, Vol.10 (3), p.2219-2237
Main Authors: Shifrin, Mark, Mitrany, Roy, Biton, Erez, Gurewitz, Omer
Format: Article
Language:English
Description
Summary:A dynamic resource allocation mechanism is an essential building block of contemporary cloud computing environments, enabling them to support the large variability of incoming requests from the enormous number of applications that use the cloud infrastructure. In this article, we devise a dynamic resource allocation mechanism that optimizes the application's profit under a given set of costs and revenues while maintaining performance constraints. Specifically, we devise a decision-maker (DM) agent that formulates the joint admission control, scaling, and load balancing problem as a stochastic process modeled as a Markov decision process (MDP), whose solution provides the optimal policy. Accordingly, at each time instance, the DM determines, based on the system's current state, the set of requirements, and the set of costs, whether to add or release a VM (scale-out or scale-in, respectively), whether to admit or reject an incoming task, and, if the task is admitted, which VM to allocate it to. We explore the structure of the value function and provide insights into the optimal policies derived from it. To address the scalability issues of the detailed MDP solution, we provide an alternative solution based on an abstract MDP, which consolidates multiple system states into a single abstract state and can therefore cope with much larger systems at the expense of slight performance degradation. To demonstrate the feasibility of the suggested scheme, we designed and implemented it, alongside two traditional auto-scalers, on the Amazon Web Services (AWS) infrastructure. We ran numerous MATLAB simulations and AWS-based experiments, which provided insights and demonstrated the scheme's superiority over the traditional policies we compared it with.
ISSN:2168-7161
2372-0018
DOI:10.1109/TCC.2020.3000956
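
Illustrative sketch (not from the article): the summary describes a DM that decides, per system state, whether to scale out or in and whether to admit a task, by solving an MDP. The toy model below shows what such a formulation could look like when solved with standard value iteration. Everything in it is an assumption made for illustration: the state (active VMs, tasks in system), the discrete-time arrival and service probabilities, the revenue and cost figures, the discount factor, and all names. It omits per-VM load balancing and the abstract-MDP aggregation, and it does not reproduce the authors' model or results.

```python
import itertools

# Hypothetical toy parameters; none of these values come from the article.
V_MAX, Q_MAX = 3, 6            # max active VMs, max tasks in the system
P_ARR, P_SRV = 0.4, 0.15       # per-step arrival prob., per-VM completion prob.
REVENUE, VM_COST, HOLD_COST = 5.0, 1.0, 0.3
GAMMA = 0.95                   # discount factor

# State: (active VMs, tasks in system). Action: (scale delta, admit flag).
STATES = [(v, q) for v in range(1, V_MAX + 1) for q in range(Q_MAX + 1)]
ACTIONS = list(itertools.product((-1, 0, 1), (False, True)))


def transitions(state, action):
    """Yield (probability, next_state, reward) for one time step."""
    v, q = state
    scale, admit = action
    v2 = min(max(v + scale, 1), V_MAX)       # scale in/out, kept within bounds
    cost = VM_COST * v2 + HOLD_COST * q      # VM rental plus a delay penalty
    p_done = min(v2, q) * P_SRV              # at most one completion per step

    # An arrival is either admitted (earning revenue) or rejected.
    if admit and q < Q_MAX:
        yield P_ARR, (v2, q + 1), REVENUE - cost
    else:
        yield P_ARR, (v2, q), -cost
    if p_done > 0:
        yield p_done, (v2, q - 1), -cost
    yield 1.0 - P_ARR - p_done, (v2, q), -cost  # nothing happens


def q_value(state, action, values):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + GAMMA * values[s2])
               for p, s2, r in transitions(state, action))


def value_iteration(tol=1e-8):
    """Iterate the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new = {s: max(q_value(s, a, values) for a in ACTIONS) for s in STATES}
        converged = max(abs(new[s] - values[s]) for s in STATES) < tol
        values = new
        if converged:
            break
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values))
              for s in STATES}
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    for s in STATES:
        scale, admit = policy[s]
        print(f"VMs={s[0]} tasks={s[1]} -> scale {scale:+d}, "
              f"{'admit' if admit else 'reject'}")
```

The state space here grows as the product of the VM and task bounds, which hints at why a detailed MDP becomes expensive for large systems and why aggregating states, as the summary mentions, trades some optimality for tractability.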