Trace cache: A low latency approach to high bandwidth instruction fetching

Bibliographic Details
Main Authors: Rotenberg, Eric; Bennett, Steve; Smith, James E
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 34
container_issue
container_start_page 24
container_title
container_volume
creator Rotenberg, Eric
Bennett, Steve
Smith, James E
description As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements will also increase. It will become necessary to fetch multiple basic blocks per cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. We propose supplementing the conventional instruction cache with a trace cache. This structure caches traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. For the Instruction Benchmark Suite (IBS) and SPEC92 integer benchmarks, a 4 kilobyte trace cache improves performance on average by 28% over conventional sequential fetching. Further, it is shown that the trace cache's efficient, low latency approach enables it to outperform more complex mechanisms that work solely out of the instruction cache.
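The idea in the description above, storing a dynamic run of instructions so that noncontiguous code can be fetched as one contiguous unit, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class name TraceCache, the (pc, taken) stream encoding, the limits of 16 instructions and 3 branches per trace, and the one-trace-per-start-address policy are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# One dynamic instruction: (pc, taken). `taken` is None for a non-branch and
# True/False for a conditional branch. This encoding is an assumption.
DynInstr = Tuple[int, Optional[bool]]


class TraceCache:
    """Toy model: one stored trace per starting address (hypothetical policy)."""

    def __init__(self, max_instrs: int = 16, max_branches: int = 3) -> None:
        # Illustrative limits on trace length; assumed values.
        self.max_instrs = max_instrs
        self.max_branches = max_branches
        # start pc -> (branch outcomes along the trace, pcs in dynamic order)
        self.lines: Dict[int, Tuple[Tuple[bool, ...], List[int]]] = {}

    def fill(self, stream: Sequence[DynInstr]) -> None:
        """Cut the dynamic stream into traces and store each one."""
        pcs: List[int] = []
        outcomes: List[bool] = []
        for pc, taken in stream:
            pcs.append(pc)
            if taken is not None:
                outcomes.append(taken)
            # Close the trace once either limit is reached.
            if len(pcs) == self.max_instrs or len(outcomes) == self.max_branches:
                self.lines[pcs[0]] = (tuple(outcomes), pcs)
                pcs, outcomes = [], []
        if pcs:  # store the leftover partial trace
            self.lines[pcs[0]] = (tuple(outcomes), pcs)

    def fetch(self, pc: int, predicted: Sequence[bool]) -> Optional[List[int]]:
        """Return the stored trace on a hit, or None on a miss.

        A hit needs a trace starting at `pc` whose recorded branch outcomes
        match the leading branch predictions.
        """
        line = self.lines.get(pc)
        if line is None:
            return None
        outcomes, pcs = line
        if tuple(predicted[: len(outcomes)]) != outcomes:
            return None
        return pcs


# Dynamic stream with two taken branches, so the pcs jump 0x104 -> 0x200
# and 0x20C -> 0x300 and are not contiguous in memory.
stream: List[DynInstr] = [
    (0x100, None), (0x104, True),
    (0x200, None), (0x204, False), (0x208, None), (0x20C, True),
    (0x300, None),
]
tc = TraceCache()
tc.fill(stream)
print(tc.fetch(0x100, [True, False, True]))
```

In this model a single lookup at the start address returns instructions from three basic blocks at once, provided the branch predictions agree with the outcomes recorded when the trace was built. On a mismatch the lookup misses, and a real fetch unit would fall back to the conventional instruction cache.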
format conference_proceeding
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_miscellaneous_26412751</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>26412751</sourcerecordid><originalsourceid>FETCH-LOGICAL-p99t-168891e4c3a13c3a59788c13f5bff1fa8a94f47315569003fe447367c283a7d03</originalsourceid><addsrcrecordid>eNotjktLAzEYRbNQsFb_Q1buBpLJ210pPim4mX35mkmaSEzGSYbiv3dAN_dyuHC4V2hDieo7zgW9Qbe1fhJCtDRig96HGazDFmxwj3iHU7ngBM1l-4NhmuayDrgVHOI54BPk8RLHFnDMtc2LbbFk7F2zIebzHbr2kKq7_-8tGp6fhv1rd_h4edvvDt1kTOuo1NpQxy0DytYQRmltKfPi5D31oMFwzxWjQkhDCPOOrySV7TUDNRK2RQ9_2vXc9-JqO37Fal1KkF1Z6rGXnPZKUPYLknRIYw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>26412751</pqid></control><display><type>conference_proceeding</type><title>Trace cache: A low latency approach to high bandwidth instruction fetching</title><source>IEEE Xplore All Conference Series</source><creator>Rotenberg, Eric ; Bennett, Steve ; Smith, James E</creator><creatorcontrib>Rotenberg, Eric ; Bennett, Steve ; Smith, James E</creatorcontrib><description>As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements will also increase. It will become necessary to fetch multiple basic blocks per cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. We propose supplementing the conventional instruction cache with a trace cache. This structure caches traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. For the Instruction Benchmark Suite (IBS) and SPEC92 integer benchmarks, a 4 kilobyte trace cache improves performance on average by 28% over conventional sequential fetching. 
Further, it is shown that the trace cache's efficient, low latency approach enables it to outperform more complex mechanisms that work solely out of the instruction cache.</description><identifier>ISSN: 1072-4451</identifier><language>eng</language><ispartof>Proceedings of the annual International Symposium on Microarchitecture, 1996, p.24-34</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784</link.rule.ids></links><search><creatorcontrib>Rotenberg, Eric</creatorcontrib><creatorcontrib>Bennett, Steve</creatorcontrib><creatorcontrib>Smith, James E</creatorcontrib><title>Trace cache: A low latency approach to high bandwidth instruction fetching</title><title>Proceedings of the annual International Symposium on Microarchitecture</title><description>As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements will also increase. It will become necessary to fetch multiple basic blocks per cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. We propose supplementing the conventional instruction cache with a trace cache. This structure caches traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. For the Instruction Benchmark Suite (IBS) and SPEC92 integer benchmarks, a 4 kilobyte trace cache improves performance on average by 28% over conventional sequential fetching. 
Further, it is shown that the trace cache's efficient, low latency approach enables it to outperform more complex mechanisms that work solely out of the instruction cache.</description><issn>1072-4451</issn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>1996</creationdate><recordtype>conference_proceeding</recordtype><recordid>eNotjktLAzEYRbNQsFb_Q1buBpLJ210pPim4mX35mkmaSEzGSYbiv3dAN_dyuHC4V2hDieo7zgW9Qbe1fhJCtDRig96HGazDFmxwj3iHU7ngBM1l-4NhmuayDrgVHOI54BPk8RLHFnDMtc2LbbFk7F2zIebzHbr2kKq7_-8tGp6fhv1rd_h4edvvDt1kTOuo1NpQxy0DytYQRmltKfPi5D31oMFwzxWjQkhDCPOOrySV7TUDNRK2RQ9_2vXc9-JqO37Fal1KkF1Z6rGXnPZKUPYLknRIYw</recordid><startdate>19960101</startdate><enddate>19960101</enddate><creator>Rotenberg, Eric</creator><creator>Bennett, Steve</creator><creator>Smith, James E</creator><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>19960101</creationdate><title>Trace cache: A low latency approach to high bandwidth instruction fetching</title><author>Rotenberg, Eric ; Bennett, Steve ; Smith, James E</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p99t-168891e4c3a13c3a59788c13f5bff1fa8a94f47315569003fe447367c283a7d03</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>1996</creationdate><toplevel>online_resources</toplevel><creatorcontrib>Rotenberg, Eric</creatorcontrib><creatorcontrib>Bennett, Steve</creatorcontrib><creatorcontrib>Smith, James E</creatorcontrib><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts 
Professional</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rotenberg, Eric</au><au>Bennett, Steve</au><au>Smith, James E</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Trace cache: A low latency approach to high bandwidth instruction fetching</atitle><btitle>Proceedings of the annual International Symposium on Microarchitecture</btitle><date>1996-01-01</date><risdate>1996</risdate><spage>24</spage><epage>34</epage><pages>24-34</pages><issn>1072-4451</issn><abstract>As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements will also increase. It will become necessary to fetch multiple basic blocks per cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. We propose supplementing the conventional instruction cache with a trace cache. This structure caches traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. For the Instruction Benchmark Suite (IBS) and SPEC92 integer benchmarks, a 4 kilobyte trace cache improves performance on average by 28% over conventional sequential fetching. Further, it is shown that the trace cache's efficient, low latency approach enables it to outperform more complex mechanisms that work solely out of the instruction cache.</abstract><tpages>11</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1072-4451
ispartof Proceedings of the annual International Symposium on Microarchitecture, 1996, p.24-34
issn 1072-4451
language eng
recordid cdi_proquest_miscellaneous_26412751
source IEEE Xplore All Conference Series
title Trace cache: A low latency approach to high bandwidth instruction fetching
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T06%3A00%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Trace%20cache:%20A%20low%20latency%20approach%20to%20high%20bandwidth%20instruction%20fetching&rft.btitle=Proceedings%20of%20the%20annual%20International%20Symposium%20on%20Microarchitecture&rft.au=Rotenberg,%20Eric&rft.date=1996-01-01&rft.spage=24&rft.epage=34&rft.pages=24-34&rft.issn=1072-4451&rft_id=info:doi/&rft_dat=%3Cproquest%3E26412751%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p99t-168891e4c3a13c3a59788c13f5bff1fa8a94f47315569003fe447367c283a7d03%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=26412751&rft_id=info:pmid/&rfr_iscdi=true