Considering all starting points for simultaneous multithreading simulation

Bibliographic Details
Main Authors: Van Biesbrouck, M., Eeckhout, L., Calder, B.
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 153
container_issue
container_start_page 143
container_title
container_volume
creator Van Biesbrouck, M.
Eeckhout, L.
Calder, B.
description Commercial processors support simultaneous multithreading (SMT), yet little work has been done to provide representative simulation results for SMT. Given a workload, current simulation techniques typically run one combination of its programs from a specific starting offset, or just run one combination of samples across the benchmarks. We have found that the architectural behavior and overall throughput can vary drastically depending on the starting points of the different benchmarks. Therefore, to completely evaluate the effect of an SMT architecture optimization on a workload, one would need to simulate many or all of the program combinations from different starting offsets. But exhaustively running all program combinations from many starting offsets is infeasible; even running single programs to completion is often infeasible with modern benchmarks. In this paper we propose an SMT simulation methodology that estimates the average performance over all possible starting points when running multiple programs concurrently on an SMT processor. It is based on our prior co-phase matrix phase analysis and simulation infrastructure. The approach samples all of the unique phase combinations for a set of benchmarks to be run together. It then uses these samples, along with a trace of the phase behavior of each program, to provide a CPI estimate over all starting points. This all-starting-points CPI estimate is precisely calculated in just minutes.
doi_str_mv 10.1109/ISPASS.2006.1620799
format conference_proceeding
fulltext fulltext_linktorsrc
identifier ISBN: 1424401860
ISBN: 9781424401864
ispartof 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006, p.143-153
issn
language eng
recordid cdi_ieee_primary_1620799
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Analytical models
Computational modeling
Computer architecture
Computer simulation
Microarchitecture
Multithreading
Phase estimation
Surface-mount technology
Throughput
Yarn
title Considering all starting points for simultaneous multithreading simulation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T08%3A46%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Considering%20all%20starting%20points%20for%20simultaneous%20multithreading%20simulation&rft.btitle=2006%20IEEE%20International%20Symposium%20on%20Performance%20Analysis%20of%20Systems%20and%20Software&rft.au=Van%20Biesbrouckt,%20M.&rft.date=2006&rft.spage=143&rft.epage=153&rft.pages=143-153&rft.isbn=1424401860&rft.isbn_list=9781424401864&rft_id=info:doi/10.1109/ISPASS.2006.1620799&rft_dat=%3Cieee_6IE%3E1620799%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i220t-c62084b6b9edd38f70e7a192b8405eb17e402d4048b071cf31fafb95cb74c9003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1620799&rfr_iscdi=true
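
The estimation step summarized in the description (sampled phase combinations plus a per-program phase trace yielding a CPI estimate over all starting points) can be illustrated with a minimal sketch. This record contains only the abstract, so nothing below is the authors' implementation. The function names, the two-thread restriction, the wrap-around of traces, the fixed cycle budget per run, and the arithmetic mean over starting points are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def run_from(trace_a, trace_b, cophase_cpi, start_a, start_b,
             interval_len, cycle_budget):
    """Aggregate CPI for one pair of starting intervals (hypothetical helper).

    trace_a, trace_b: phase ID of each fixed-length interval of a program.
    cophase_cpi: {(phase_a, phase_b): (cpi_a, cpi_b)}, the sampled per-thread
                 CPI when those two phases co-run (values must be positive).
    """
    pos_a = Fraction(start_a * interval_len)  # instructions from trace start
    pos_b = Fraction(start_b * interval_len)
    begin_a, begin_b = pos_a, pos_b
    cycles_left = Fraction(cycle_budget)
    while cycles_left > 0:
        # Phase each thread is in now; traces wrap around (an assumption).
        ph_a = trace_a[int(pos_a // interval_len) % len(trace_a)]
        ph_b = trace_b[int(pos_b // interval_len) % len(trace_b)]
        cpi_a, cpi_b = (Fraction(c) for c in cophase_cpi[(ph_a, ph_b)])
        # Cycles until either thread reaches its next interval boundary.
        to_a = (interval_len - pos_a % interval_len) * cpi_a
        to_b = (interval_len - pos_b % interval_len) * cpi_b
        step = min(to_a, to_b, cycles_left)
        # Advance both threads at the rates sampled for this phase pair.
        pos_a += step / cpi_a
        pos_b += step / cpi_b
        cycles_left -= step
    retired = (pos_a - begin_a) + (pos_b - begin_b)
    return Fraction(cycle_budget) / retired


def all_starting_points_cpi(trace_a, trace_b, cophase_cpi,
                            interval_len, cycle_budget):
    """Arithmetic mean of the per-run CPI over every pair of starting intervals."""
    starts = list(product(range(len(trace_a)), range(len(trace_b))))
    total = sum(run_from(trace_a, trace_b, cophase_cpi, sa, sb,
                         interval_len, cycle_budget) for sa, sb in starts)
    return total / len(starts)


# Toy inputs, invented for illustration only.
toy_cpi = {(0, 0): (1, 1), (1, 0): (2, 2)}
print(all_starting_points_cpi([0, 1], [0], toy_cpi, 10, 20))
```

Exact rational arithmetic (`Fraction`) is used so that a thread lands exactly on an interval boundary instead of drifting by floating-point error. No detailed simulation happens inside the loop: each step is a table lookup, which is consistent with the description's claim that the estimate takes minutes once the phase combinations have been sampled.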