Speculative execution for hiding memory latency
L2 misses are one of the main causes of stalls in current and future microprocessors. In this paper we present a mechanism to speculatively execute instructions independent of L2-miss loads, even if no entry in the reorder buffer is available. The proposed mechanism generates future i...
Published in: | Computer architecture news 2005-06, Vol.33 (3), p.49-56 |
---|---|
Main Authors: | Pajuelo, Alex; González, Antonio; Valero, Mateo |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c1171-50e81fb09327958555a3f8ed4188d30b83d2598a75fdf97e842b775e69c8ca453 |
---|---|
cites | cdi_FETCH-LOGICAL-c1171-50e81fb09327958555a3f8ed4188d30b83d2598a75fdf97e842b775e69c8ca453 |
container_end_page | 56 |
container_issue | 3 |
container_start_page | 49 |
container_title | Computer architecture news |
container_volume | 33 |
creator | Pajuelo, Alex ; González, Antonio ; Valero, Mateo |
description | L2 misses are one of the main causes of stalls in current and future microprocessors. In this paper we present a mechanism to speculatively execute instructions that are independent of L2-miss loads, even if no entry in the reorder buffer is available. The proposed mechanism generates future instances of instructions that are expected to be independent of the delinquent load. When these dynamic instructions are later fetched, they use the previously precomputed data and go directly to the commit stage without executing. The mechanism replicates strided loads found above the L2-miss load that produce the data for the target independent instructions. Instructions following the L2-miss load check whether their source operands have been replicated; if so, multiple speculative instances of them are also generated. The mechanism is built on top of a superscalar processor with an aggressive prefetch scheme. Compared to this baseline, it achieves a 21% performance improvement. |
doi_str_mv | 10.1145/1101868.1101877 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0163-5964 |
ispartof | Computer architecture news, 2005-06, Vol.33 (3), p.49-56 |
issn | 0163-5964 |
language | eng |
recordid | cdi_proquest_miscellaneous_29479671 |
source | Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list) |
title | Speculative execution for hiding memory latency |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T07%3A31%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Speculative%20execution%20for%20hiding%20memory%20latency&rft.jtitle=Computer%20architecture%20news&rft.au=Pajuelo,%20Alex&rft.date=2005-06&rft.volume=33&rft.issue=3&rft.spage=49&rft.epage=56&rft.pages=49-56&rft.issn=0163-5964&rft_id=info:doi/10.1145/1101868.1101877&rft_dat=%3Cproquest_cross%3E29479671%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c1171-50e81fb09327958555a3f8ed4188d30b83d2598a75fdf97e842b775e69c8ca453%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=29479671&rft_id=info:pmid/&rfr_iscdi=true |
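
The description above mentions replicating strided loads and precomputing future instances of the instructions that depend on them. The following is a minimal software sketch of that idea only, not the hardware design from the paper: the names `StridedLoad`, `precompute`, the two-observation confidence rule, and the dictionary standing in for memory are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class StridedLoad:
    """Hypothetical tracker for one static load: last address and observed stride."""
    last_addr: Optional[int] = None
    stride: Optional[int] = None
    confident: bool = False

    def observe(self, addr: int) -> None:
        """Record one dynamic instance of the load."""
        if self.last_addr is not None:
            new_stride = addr - self.last_addr
            # Assumed rule: trust the stride once it repeats twice in a row.
            self.confident = new_stride == self.stride
            self.stride = new_stride
        self.last_addr = addr

    def future_addrs(self, count: int) -> List[int]:
        """Addresses the next `count` instances would touch if the stride holds."""
        if not self.confident:
            return []
        return [self.last_addr + self.stride * k for k in range(1, count + 1)]


def precompute(memory: Dict[int, int], load: StridedLoad,
               op: Callable[[int], int], count: int) -> Dict[int, int]:
    """Replicate the strided load `count` times and precompute the dependent
    operation `op` for each future instance, keyed by predicted address."""
    return {addr: op(memory[addr])
            for addr in load.future_addrs(count) if addr in memory}


# Toy memory: addresses 0, 8, ..., 56 holding addr * 10.
memory = {addr: addr * 10 for addr in range(0, 64, 8)}
load = StridedLoad()
for addr in (0, 8, 16):
    load.observe(addr)

# Results for the next three instances, computed before they are fetched.
table = precompute(memory, load, lambda value: value + 1, count=3)
```

In this sketch, a later instance whose address matches a key in `table` could reuse the stored result instead of executing, which loosely mirrors the "go directly to the commit stage" behaviour in the description. How the real mechanism validates predictions and stores results is not specified in this record.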