Considering all starting points for simultaneous multithreading simulation
Commercial processors have support for simultaneous multithreading (SMT), yet little work has been done to provide representative simulation results for SMT. Given a workload, current simulation techniques typically run one combination of those programs from a specific starting offset, or just run o...
Main Authors: | Van Biesbrouckt, M., Eeckhout, L., Calder, B. |
---|---|
Format: | Conference Proceeding |
Language: | English |
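Subjects: | Analytical models ; Computational modeling ; Computer architecture ; Computer simulation ; Microarchitecture ; Multithreading ; Phase estimation ; Surface-mount technology ; Throughput ; Yarn |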
cited_by | |
---|---|
cites | |
container_end_page | 153 |
container_issue | |
container_start_page | 143 |
container_title | |
container_volume | |
creator | Van Biesbrouckt, M. ; Eeckhout, L. ; Calder, B. |
description | Commercial processors support simultaneous multithreading (SMT), yet little work has been done to provide representative simulation results for SMT. Given a workload, current simulation techniques typically run one combination of its programs from a specific starting offset, or run just one combination of samples across the benchmarks. We have found that architecture behavior and overall throughput can vary drastically depending on the starting points of the different benchmarks. Therefore, to completely evaluate the effect of an SMT architecture optimization on a workload, one would need to simulate many or all of the program combinations from different starting offsets. But exhaustively running all program combinations from many starting offsets is infeasible; even running single programs to completion is often infeasible with modern benchmarks. In this paper we propose an SMT simulation methodology that estimates the average performance over all possible starting points when running multiple programs concurrently on an SMT processor. The methodology builds on our prior co-phase matrix phase analysis and simulation infrastructure. It samples all of the unique phase combinations for a set of benchmarks to be run together, then uses these samples, along with a trace of the phase behavior of each program, to estimate CPI (cycles per instruction) over all starting points. This all-starting-points CPI estimate is calculated precisely in just minutes. |
doi_str_mv | 10.1109/ISPASS.2006.1620799 |
format | conference_proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 1424401860 |
ispartof | 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006, p.143-153 |
issn | |
language | eng |
recordid | cdi_ieee_primary_1620799 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Analytical models ; Computational modeling ; Computer architecture ; Computer simulation ; Microarchitecture ; Multithreading ; Phase estimation ; Surface-mount technology ; Throughput ; Yarn |
title | Considering all starting points for simultaneous multithreading simulation |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T08%3A46%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Considering%20all%20starting%20points%20for%20simultaneous%20multithreading%20simulation&rft.btitle=2006%20IEEE%20International%20Symposium%20on%20Performance%20Analysis%20of%20Systems%20and%20Software&rft.au=Van%20Biesbrouckt,%20M.&rft.date=2006&rft.spage=143&rft.epage=153&rft.pages=143-153&rft.isbn=1424401860&rft.isbn_list=9781424401864&rft_id=info:doi/10.1109/ISPASS.2006.1620799&rft_dat=%3Cieee_6IE%3E1620799%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i220t-c62084b6b9edd38f70e7a192b8405eb17e402d4048b071cf31fafb95cb74c9003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1620799&rfr_iscdi=true |
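
The description in this record outlines the idea of combining sampled phase combinations with per-program phase traces to estimate CPI over all starting points. The Python sketch below is only a toy illustration of that idea, not the authors' implementation. It assumes two co-running programs, fixed-length intervals, traces that wrap around when a program finishes, and a fixed total instruction budget per run. The names `estimate_cpi`, `all_starting_points_cpi`, `cophase_ipc`, `interval_len`, and `budget` are hypothetical, and the IPC values in the demo are made up.

```python
from itertools import product


def estimate_cpi(trace_a, trace_b, cophase_ipc, start_a, start_b,
                 interval_len=100.0, budget=1000.0):
    """Fast-forward two co-running programs analytically and return the
    combined CPI (cycles per committed instruction) for one starting pair.

    trace_a, trace_b: phase ID per fixed-length interval (hypothetical input).
    cophase_ipc: maps (phase_a, phase_b) -> (ipc_a, ipc_b), i.e. the sampled
                 per-thread IPC when those two phases run together.
    """
    traces = (trace_a, trace_b)
    pos = [start_a, start_b]              # current interval index per thread
    left = [interval_len, interval_len]   # instructions left in that interval
    cycles = 0.0
    done = 0.0                            # instructions committed by both threads
    while budget - done > 1e-9:
        ipc = cophase_ipc[(trace_a[pos[0]], trace_b[pos[1]])]
        # Cycles until either thread reaches its next interval boundary.
        step = min(left[0] / ipc[0], left[1] / ipc[1])
        # Do not run past the instruction budget.
        step = min(step, (budget - done) / (ipc[0] + ipc[1]))
        cycles += step
        for t in (0, 1):
            executed = step * ipc[t]
            done += executed
            left[t] -= executed
            if left[t] <= 1e-9:           # boundary reached: move to next interval
                pos[t] = (pos[t] + 1) % len(traces[t])
                left[t] = interval_len
    return cycles / done


def all_starting_points_cpi(trace_a, trace_b, cophase_ipc, **kwargs):
    """Average the estimate over every pair of starting intervals."""
    starts = list(product(range(len(trace_a)), range(len(trace_b))))
    total = sum(estimate_cpi(trace_a, trace_b, cophase_ipc, a, b, **kwargs)
                for a, b in starts)
    return total / len(starts)


if __name__ == "__main__":
    # Made-up co-phase matrix: program A has phases "A" and "B", program B only "X".
    matrix = {("A", "X"): (1.0, 1.0), ("B", "X"): (0.5, 1.0)}
    print(all_starting_points_cpi(["A", "B"], ["X"], matrix,
                                  interval_len=100.0, budget=100.0))
```

Because each phase pair is looked up rather than simulated, the cost of one estimate grows with the number of phase transitions, not with the number of instructions, which is why averaging over every starting pair stays cheap in this toy setting.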