
Speculative execution for hiding memory latency

L2 misses are one of the main causes for stalling the activity in current and future microprocessors. In this paper we present a mechanism to speculatively execute independent instructions of L2-miss loads, even if no entry in the reorder buffer is available. The proposed mechanism generates future i...

Bibliographic Details
Published in: Computer architecture news 2005-06, Vol.33 (3), p.49-56
Main Authors: Pajuelo, Alex; González, Antonio; Valero, Mateo
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c1171-50e81fb09327958555a3f8ed4188d30b83d2598a75fdf97e842b775e69c8ca453
cites cdi_FETCH-LOGICAL-c1171-50e81fb09327958555a3f8ed4188d30b83d2598a75fdf97e842b775e69c8ca453
container_end_page 56
container_issue 3
container_start_page 49
container_title Computer architecture news
container_volume 33
creator Pajuelo, Alex
González, Antonio
Valero, Mateo
description L2 misses are one of the main causes of stalls in current and future microprocessors. In this paper we present a mechanism to speculatively execute instructions that are independent of L2-miss loads, even if no entry in the reorder buffer is available. The proposed mechanism generates future instances of instructions that are expected to be independent of the delinquent load. When these dynamic instructions are later fetched, they use the previously precomputed data and go directly to the commit stage without executing. The mechanism replicates strided loads found above the L2-miss load that produce the data for the target independent instructions. Instructions following the L2-miss load check whether their source operands have been replicated; if so, multiple speculative instances of them are also generated. This mechanism is built on top of a superscalar processor with an aggressive prefetch scheme. Compared to this baseline, the mechanism achieves a 21% performance improvement.
doi_str_mv 10.1145/1101868.1101877
format article
fulltext fulltext
identifier ISSN: 0163-5964
ispartof Computer architecture news, 2005-06, Vol.33 (3), p.49-56
issn 0163-5964
language eng
recordid cdi_proquest_miscellaneous_29479671
source Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
title Speculative execution for hiding memory latency
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T07%3A31%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Speculative%20execution%20for%20hiding%20memory%20latency&rft.jtitle=Computer%20architecture%20news&rft.au=Pajuelo,%20Alex&rft.date=2005-06&rft.volume=33&rft.issue=3&rft.spage=49&rft.epage=56&rft.pages=49-56&rft.issn=0163-5964&rft_id=info:doi/10.1145/1101868.1101877&rft_dat=%3Cproquest_cross%3E29479671%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c1171-50e81fb09327958555a3f8ed4188d30b83d2598a75fdf97e842b775e69c8ca453%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=29479671&rft_id=info:pmid/&rfr_iscdi=true