WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs

This paper proposes a new data prefetching technique for Graphics Processing Units (GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to dynamically select warps whose progress is slower than that of the current warp as prefetching target warps. Under the in-order instru...

Bibliographic Details
Published in:	IEEE transactions on computers 2018-09, Vol.67 (9), p.1366-1373
Main Authors:	Oh, Yunho; Yoon, Myung Kuk; Park, Jong Hyun; Park, Yongjun; Ro, Won Woo
Format:	Article
Language:	English
Subjects:	cache performance; data prefetching; Dynamic loads; GPGPU; Graphics processing units; Hardware; Message systems; Micromechanical devices; Monitoring; Prefetching; Warp; warp scheduling

cited_by	cdi_FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013
cites	cdi_FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013
container_end_page	1373
container_issue	9
container_start_page	1366
container_title	IEEE transactions on computers
container_volume	67
creator	Oh, Yunho; Yoon, Myung Kuk; Park, Jong Hyun; Park, Yongjun; Ro, Won Woo
description	This paper proposes a new data prefetching technique for Graphics Processing Units (GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to dynamically select warps whose progress is slower than that of the current warp as prefetching target warps. Under the in-order instruction execution model of GPUs, these prefetching target warps will certainly execute the same load as the current warp. Exploiting that, WASP prefetches the data for prefetching target warps, which allows the prefetched data to be accurately accessed. To simply verify the progress of the warps, WASP monitors the counts of the dynamic load executions for all warps. When a warp executes a load, WASP searches the warps with lower load execution counts than the current warp and generates the prefetch requests for them. In our evaluation, WASP achieves a 16.8 percent speedup compared to the baseline GPU.
doi_str_mv	10.1109/TC.2018.2813379
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2117122576</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8309426</ieee_id><sourcerecordid>2117122576</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013</originalsourceid><addsrcrecordid>eNo9kEFPwkAQhTdGExE9e_CyiefCzG677XozVdEElUgNx00pA5RAi7uLxn_vEoinycx8bybvMXaN0EME3S_yngDMeiJDKVN9wjqYJGmkdaJOWQfCKtIyhnN24dwKAJQA3WFvk_vx6I6PaU2Vr7-JP5S-5CNLc_LVsm4W_Kf2S_7aNrVv7b7_2DW-3hCflHYbwHZhyTneNnww-nSX7Gxerh1dHWuXFU-PRf4cDd8HL_n9MKpEpn0kMU1lLAlJ6wxI6DhBoXEmMAxjEEkppomeKsggq6YVkUKl4grKWSmDE9llt4ezW9t-7ch5s2p3tgkfjUBMUYgkVYHqH6jKts4FS2Zr601pfw2C2WdmitzsMzPHzILi5qCoieifziToWCj5B0naZPk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2117122576</pqid></control><display><type>article</type><title>WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Oh, Yunho ; Yoon, Myung Kuk ; Park, Jong Hyun ; Park, Yongjun ; Ro, Won Woo</creator><creatorcontrib>Oh, Yunho ; Yoon, Myung Kuk ; Park, Jong Hyun ; Park, Yongjun ; Ro, Won Woo</creatorcontrib><description>This paper proposes a new data prefetching technique for Graphics Processing Units (GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to dynamically select warps whose progress is slower than that of the current warp as prefetching target warps. Under the in-order instruction execution model of GPUs, these prefetching target warps will certainly execute the same load as the current warp. Exploiting that, WASP prefetches the data for prefetching target warps, which allows the prefetched data to be accurately accessed. To simply verify the progress of the warps, WASP monitors the counts of the dynamic load executions for all warps. 
When a warp executes a load, WASP searches the warps with lower load execution counts than the current warp and generates the prefetch requests for them. In our evaluation, WASP achieves a 16.8 percent speedup compared to the baseline GPU.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/TC.2018.2813379</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>cache performance ; data prefetching ; Dynamic loads ; GPGPU ; Graphics processing units ; Hardware ; Message systems ; Micromechanical devices ; Monitoring ; Prefetching ; Warp ; warp scheduling</subject><ispartof>IEEE transactions on computers, 2018-09, Vol.67 (9), p.1366-1373</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013</citedby><cites>FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013</cites><orcidid>0000-0001-5390-6445</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8309426$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27922,27923,54794</link.rule.ids></links><search><creatorcontrib>Oh, Yunho</creatorcontrib><creatorcontrib>Yoon, Myung Kuk</creatorcontrib><creatorcontrib>Park, Jong Hyun</creatorcontrib><creatorcontrib>Park, Yongjun</creatorcontrib><creatorcontrib>Ro, Won Woo</creatorcontrib><title>WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>This paper proposes a new data prefetching technique 
for Graphics Processing Units (GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to dynamically select warps whose progress is slower than that of the current warp as prefetching target warps. Under the in-order instruction execution model of GPUs, these prefetching target warps will certainly execute the same load as the current warp. Exploiting that, WASP prefetches the data for prefetching target warps, which allows the prefetched data to be accurately accessed. To simply verify the progress of the warps, WASP monitors the counts of the dynamic load executions for all warps. When a warp executes a load, WASP searches the warps with lower load execution counts than the current warp and generates the prefetch requests for them. In our evaluation, WASP achieves a 16.8 percent speedup compared to the baseline GPU.</description><subject>cache performance</subject><subject>data prefetching</subject><subject>Dynamic loads</subject><subject>GPGPU</subject><subject>Graphics processing units</subject><subject>Hardware</subject><subject>Message systems</subject><subject>Micromechanical devices</subject><subject>Monitoring</subject><subject>Prefetching</subject><subject>Warp</subject><subject>warp scheduling</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNo9kEFPwkAQhTdGExE9e_CyiefCzG677XozVdEElUgNx00pA5RAi7uLxn_vEoinycx8bybvMXaN0EME3S_yngDMeiJDKVN9wjqYJGmkdaJOWQfCKtIyhnN24dwKAJQA3WFvk_vx6I6PaU2Vr7-JP5S-5CNLc_LVsm4W_Kf2S_7aNrVv7b7_2DW-3hCflHYbwHZhyTneNnww-nSX7Gxerh1dHWuXFU-PRf4cDd8HL_n9MKpEpn0kMU1lLAlJ6wxI6DhBoXEmMAxjEEkppomeKsggq6YVkUKl4grKWSmDE9llt4ezW9t-7ch5s2p3tgkfjUBMUYgkVYHqH6jKts4FS2Zr601pfw2C2WdmitzsMzPHzILi5qCoieifziToWCj5B0naZPk</recordid><startdate>20180901</startdate><enddate>20180901</enddate><creator>Oh, Yunho</creator><creator>Yoon, Myung Kuk</creator><creator>Park, Jong Hyun</creator><creator>Park, 
Yongjun</creator><creator>Ro, Won Woo</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-5390-6445</orcidid></search><sort><creationdate>20180901</creationdate><title>WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs</title><author>Oh, Yunho ; Yoon, Myung Kuk ; Park, Jong Hyun ; Park, Yongjun ; Ro, Won Woo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>cache performance</topic><topic>data prefetching</topic><topic>Dynamic loads</topic><topic>GPGPU</topic><topic>Graphics processing units</topic><topic>Hardware</topic><topic>Message systems</topic><topic>Micromechanical devices</topic><topic>Monitoring</topic><topic>Prefetching</topic><topic>Warp</topic><topic>warp scheduling</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Oh, Yunho</creatorcontrib><creatorcontrib>Yoon, Myung Kuk</creatorcontrib><creatorcontrib>Park, Jong Hyun</creatorcontrib><creatorcontrib>Park, Yongjun</creatorcontrib><creatorcontrib>Ro, Won Woo</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library Online</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Oh, Yunho</au><au>Yoon, Myung Kuk</au><au>Park, Jong Hyun</au><au>Park, Yongjun</au><au>Ro, Won Woo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2018-09-01</date><risdate>2018</risdate><volume>67</volume><issue>9</issue><spage>1366</spage><epage>1373</epage><pages>1366-1373</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>This paper proposes a new data prefetching technique for Graphics Processing Units (GPUs) called Warp Aware Selective Prefetching (WASP). The main idea of WASP is to dynamically select warps whose progress is slower than that of the current warp as prefetching target warps. Under the in-order instruction execution model of GPUs, these prefetching target warps will certainly execute the same load as the current warp. Exploiting that, WASP prefetches the data for prefetching target warps, which allows the prefetched data to be accurately accessed. To simply verify the progress of the warps, WASP monitors the counts of the dynamic load executions for all warps. When a warp executes a load, WASP searches the warps with lower load execution counts than the current warp and generates the prefetch requests for them. 
In our evaluation, WASP achieves a 16.8 percent speedup compared to the baseline GPU.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TC.2018.2813379</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0001-5390-6445</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0018-9340
ispartof	IEEE transactions on computers, 2018-09, Vol.67 (9), p.1366-1373
issn	0018-9340 1557-9956
language	eng
recordid	cdi_proquest_journals_2117122576
source	IEEE Electronic Library (IEL) Journals
subjects	cache performance; data prefetching; Dynamic loads; GPGPU; Graphics processing units; Hardware; Message systems; Micromechanical devices; Monitoring; Prefetching; Warp; warp scheduling
title	WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T17%3A24%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=WASP:%20Selective%20Data%20Prefetching%20with%20Monitoring%20Runtime%20Warp%20Progress%20on%20GPUs&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Oh,%20Yunho&rft.date=2018-09-01&rft.volume=67&rft.issue=9&rft.spage=1366&rft.epage=1373&rft.pages=1366-1373&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/TC.2018.2813379&rft_dat=%3Cproquest_cross%3E2117122576%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c289t-3177343e1e9980e29451291d2143e4025a2b59b60808cbcee61664c0ada30013%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2117122576&rft_id=info:pmid/&rft_ieee_id=8309426&rfr_iscdi=true
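
The abstract's selection rule (count dynamic loads per warp, then prefetch for warps with lower counts than the current one) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `WarpProgressMonitor`, `on_load`, `prefetch_requests` and `address_for` are hypothetical, the counter is assumed to be incremented before the comparison, and the abstract does not say how prefetch addresses are derived, so the caller supplies that function.

```python
class WarpProgressMonitor:
    """Toy model (assumption): one dynamic-load execution counter per warp."""

    def __init__(self, num_warps):
        self.load_counts = [0] * num_warps

    def on_load(self, warp_id):
        """Record a load by warp_id and return the prefetching target warps.

        Assumption: the counter is incremented first, so the targets are the
        warps with a strictly lower load count than the current warp.
        """
        self.load_counts[warp_id] += 1
        current = self.load_counts[warp_id]
        return [w for w, count in enumerate(self.load_counts)
                if w != warp_id and count < current]


def prefetch_requests(monitor, warp_id, address_for):
    """Pair each target warp with an address from the caller's address_for.

    address_for is a hypothetical stand-in; the abstract does not describe
    how WASP computes the addresses it prefetches.
    """
    return [(w, address_for(w)) for w in monitor.on_load(warp_id)]
```

For example, with four warps that have executed no loads, a load by warp 0 selects warps 1, 2 and 3 as targets; if warp 1 then executes its first load, only warps 2 and 3 remain behind it.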