
Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters

Bibliographic Details
Published in: arXiv.org 2024-07
Main Authors: Yang, Dingyu; Zheng, Kangpeng; Qian, Shiyou; Cao, Jian; Xue, Guangtao
Format: Article
Language: English
container_title arXiv.org
creator Yang, Dingyu
Zheng, Kangpeng
Qian, Shiyou
Cao, Jian
Xue, Guangtao
description Co-locating latency-critical services (LCSs) and best-effort jobs (BEJs) constitutes the principal approach for enhancing resource utilization in production. Nevertheless, the co-location practice hurts the performance of LCSs due to resource competition, even when employing isolation technology. Through an extensive analysis of voluminous real trace data derived from two production clusters, we observe that BEJs typically exhibit periodic execution patterns and serve as the primary sources of interference to LCSs. Furthermore, even at the same level of resource consumption, different compositions of BEJs can result in varying degrees of interference on LCSs. Subsequently, we propose PISM, a proactive Performance Interference Scoring and Mitigating framework for LCSs through the optimization of BEJ scheduling. Firstly, PISM adopts a data-driven approach to establish a characterization and classification methodology for BEJs. Secondly, PISM models the relationship between the composition of BEJs on servers and the response time (RT) of LCSs. Thirdly, PISM establishes an interference scoring mechanism in terms of RT, which serves as the foundation for BEJ scheduling. We assess the effectiveness of PISM on a small-scale cluster and through extensive data-driven simulations. The experimental results demonstrate that PISM can reduce cluster interference by up to 41.5%, and improve the throughput of long-tail LCSs by 76.4%.
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3082383550</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3082383550</sourcerecordid><originalsourceid>FETCH-proquest_journals_30823835503</originalsourceid><addsrcrecordid>eNqNjEEKwjAQAIMgWLR_WPBciInV3ouiYE968SQhbNuUmuhuqt-3gg_wNIcZZiISpfUqK9ZKzUTK3Ekp1War8lwn4lq56BoTnW_g6CNSjYTeIoQaKmcpMNLLWWR4u9iCgbMN9I0rtK3xju_gPJwMNZixNT1C2Q88fnghprXpGdMf52K5313KQ_ag8ByQ460LA_lR3bQslC50nkv9X_UB-WRCVQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3082383550</pqid></control><display><type>article</type><title>Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters</title><source>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</source><creator>Yang, Dingyu ; Zheng, Kangpeng ; Qian, Shiyou ; Cao, Jian ; Xue, Guangtao</creator><creatorcontrib>Yang, Dingyu ; Zheng, Kangpeng ; Qian, Shiyou ; Cao, Jian ; Xue, Guangtao</creatorcontrib><description>Co-locating latency-critical services (LCSs) and best-effort jobs (BEJs) constitute the principal approach for enhancing resource utilization in production. Nevertheless, the co-location practice hurts the performance of LCSs due to resource competition, even when employing isolation technology. Through an extensive analysis of voluminous real trace data derived from two production clusters, we observe that BEJs typically exhibit periodic execution patterns and serve as the primary sources of interference to LCSs. Furthermore, despite occupying the same level of resource consumption, the diverse compositions of BEJs can result in varying degrees of interference on LCSs. Subsequently, we propose PISM, a proactive Performance Interference Scoring and Mitigating framework for LCSs through the optimization of BEJ scheduling. 
Firstly, PISM adopts a data-driven approach to establish a characterization and classification methodology for BEJs. Secondly, PISM models the relationship between the composition of BEJs on servers and the response time (RT) of LCSs. Thirdly, PISM establishes an interference scoring mechanism in terms of RT, which serves as the foundation for BEJ scheduling. We assess the effectiveness of PISM on a small-scale cluster and through extensive data-driven simulations. The experiment results demonstrate that PISM can reduce cluster interference by up to 41.5%, and improve the throughput of long-tail LCSs by 76.4%.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Cluster analysis ; Clusters ; Composition ; Performance evaluation ; Resource utilization ; Scheduling ; Technology assessment</subject><ispartof>arXiv.org, 2024-07</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/3082383550?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,37012,44590</link.rule.ids></links><search><creatorcontrib>Yang, Dingyu</creatorcontrib><creatorcontrib>Zheng, Kangpeng</creatorcontrib><creatorcontrib>Qian, Shiyou</creatorcontrib><creatorcontrib>Cao, Jian</creatorcontrib><creatorcontrib>Xue, Guangtao</creatorcontrib><title>Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters</title><title>arXiv.org</title><description>Co-locating latency-critical services (LCSs) and best-effort jobs (BEJs) constitute the principal approach for enhancing resource utilization in production. Nevertheless, the co-location practice hurts the performance of LCSs due to resource competition, even when employing isolation technology. Through an extensive analysis of voluminous real trace data derived from two production clusters, we observe that BEJs typically exhibit periodic execution patterns and serve as the primary sources of interference to LCSs. Furthermore, despite occupying the same level of resource consumption, the diverse compositions of BEJs can result in varying degrees of interference on LCSs. Subsequently, we propose PISM, a proactive Performance Interference Scoring and Mitigating framework for LCSs through the optimization of BEJ scheduling. Firstly, PISM adopts a data-driven approach to establish a characterization and classification methodology for BEJs. 
Secondly, PISM models the relationship between the composition of BEJs on servers and the response time (RT) of LCSs. Thirdly, PISM establishes an interference scoring mechanism in terms of RT, which serves as the foundation for BEJ scheduling. We assess the effectiveness of PISM on a small-scale cluster and through extensive data-driven simulations. The experiment results demonstrate that PISM can reduce cluster interference by up to 41.5%, and improve the throughput of long-tail LCSs by 76.4%.</description><subject>Cluster analysis</subject><subject>Clusters</subject><subject>Composition</subject><subject>Performance evaluation</subject><subject>Resource utilization</subject><subject>Scheduling</subject><subject>Technology assessment</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNjEEKwjAQAIMgWLR_WPBciInV3ouiYE968SQhbNuUmuhuqt-3gg_wNIcZZiISpfUqK9ZKzUTK3Ekp1War8lwn4lq56BoTnW_g6CNSjYTeIoQaKmcpMNLLWWR4u9iCgbMN9I0rtK3xju_gPJwMNZixNT1C2Q88fnghprXpGdMf52K5313KQ_ag8ByQ460LA_lR3bQslC50nkv9X_UB-WRCVQ</recordid><startdate>20240717</startdate><enddate>20240717</enddate><creator>Yang, Dingyu</creator><creator>Zheng, Kangpeng</creator><creator>Qian, Shiyou</creator><creator>Cao, Jian</creator><creator>Xue, Guangtao</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240717</creationdate><title>Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters</title><author>Yang, Dingyu ; Zheng, Kangpeng ; 
Qian, Shiyou ; Cao, Jian ; Xue, Guangtao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30823835503</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Cluster analysis</topic><topic>Clusters</topic><topic>Composition</topic><topic>Performance evaluation</topic><topic>Resource utilization</topic><topic>Scheduling</topic><topic>Technology assessment</topic><toplevel>online_resources</toplevel><creatorcontrib>Yang, Dingyu</creatorcontrib><creatorcontrib>Zheng, Kangpeng</creatorcontrib><creatorcontrib>Qian, Shiyou</creatorcontrib><creatorcontrib>Cao, Jian</creatorcontrib><creatorcontrib>Xue, Guangtao</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Dingyu</au><au>Zheng, Kangpeng</au><au>Qian, Shiyou</au><au>Cao, Jian</au><au>Xue, 
Guangtao</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters</atitle><jtitle>arXiv.org</jtitle><date>2024-07-17</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Co-locating latency-critical services (LCSs) and best-effort jobs (BEJs) constitute the principal approach for enhancing resource utilization in production. Nevertheless, the co-location practice hurts the performance of LCSs due to resource competition, even when employing isolation technology. Through an extensive analysis of voluminous real trace data derived from two production clusters, we observe that BEJs typically exhibit periodic execution patterns and serve as the primary sources of interference to LCSs. Furthermore, despite occupying the same level of resource consumption, the diverse compositions of BEJs can result in varying degrees of interference on LCSs. Subsequently, we propose PISM, a proactive Performance Interference Scoring and Mitigating framework for LCSs through the optimization of BEJ scheduling. Firstly, PISM adopts a data-driven approach to establish a characterization and classification methodology for BEJs. Secondly, PISM models the relationship between the composition of BEJs on servers and the response time (RT) of LCSs. Thirdly, PISM establishes an interference scoring mechanism in terms of RT, which serves as the foundation for BEJ scheduling. We assess the effectiveness of PISM on a small-scale cluster and through extensive data-driven simulations. The experiment results demonstrate that PISM can reduce cluster interference by up to 41.5%, and improve the throughput of long-tail LCSs by 76.4%.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-07
issn 2331-8422
language eng
recordid cdi_proquest_journals_3082383550
source Publicly Available Content Database (Proquest) (PQ_SDU_P3)
subjects Cluster analysis
Clusters
Composition
Performance evaluation
Resource utilization
Scheduling
Technology assessment
title Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T07%3A28%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Mitigating%20Interference%20of%20Microservices%20with%20a%20Scoring%20Mechanism%20in%20Large-scale%20Clusters&rft.jtitle=arXiv.org&rft.au=Yang,%20Dingyu&rft.date=2024-07-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3082383550%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_30823835503%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3082383550&rft_id=info:pmid/&rfr_iscdi=true