DR DRAM: Accelerating Memory-Read-Intensive Applications
Many data analytics workloads today, such as graph processing and neural networks, need efficient memory reads. Preprocessing raw data likewise demands higher memory read bandwidth. Unfortunately, because dynamic refresh is necessary, modern DRAM systems have to stall mem...
Main Authors: | Cao, Yuhai; Li, Chao; Chen, Quan; Leng, Jingwen; Guo, Minyi; Wang, Jing; Zhang, Weigong |
---|---|
Format: | Conference Proceeding |
Language: | English |
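Subjects: | Bandwidth; data analysis; DRAM refresh; Magnetic resonance imaging; memory bandwidth; Memory management; Performance evaluation; Random access memory; read-intensive applications; redundant data storage; Task analysis |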
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 309 |
container_issue | |
container_start_page | 301 |
container_title | |
container_volume | |
creator | Cao, Yuhai; Li, Chao; Chen, Quan; Leng, Jingwen; Guo, Minyi; Wang, Jing; Zhang, Weigong |
description | Many data analytics workloads today, such as graph processing and neural networks, need efficient memory reads. Preprocessing raw data likewise demands higher memory read bandwidth. Unfortunately, because dynamic refresh is necessary, a modern DRAM system has to stall memory access during each refresh cycle. As DRAM device density continues to grow, the refresh time must also lengthen to cover more memory rows. Consequently, DRAM refresh can become a crucial throughput bottleneck for memory-read-intensive (MRI) data processing tasks. To fully unleash the performance of these applications, we revisit the conventional DRAM architecture and refresh mechanism. We propose DR DRAM, an application-specific memory design approach that makes a novel tradeoff between read and write performance. Simply put, DR has two meanings: device refresh and data recovery. It aims to eliminate stalls by allowing read and refresh operations to proceed simultaneously. Unlike traditional schemes, DR uses device refresh, which refreshes only one specific device at a time. Meanwhile, DR increases read efficiency by recovering the inaccessible data that resides on the device being refreshed. Our design can be implemented in the existing redundant data storage area of DRAM. In this paper we detail DR's architecture and protocol design and evaluate it on a cycle-accurate simulator. Our results show that DR can nearly eliminate refresh overhead for memory reads, delivering up to 12% extra maximum read bandwidth and a 50-60% latency improvement on present DDR4 devices. |
doi_str_mv | 10.1109/ICCD.2018.00053 |
format | conference_proceeding |
eisbn | 1538684772; 9781538684771 |
publisher | IEEE |
date | 2018-10 |
coden | IEEPAD |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 2576-6996 |
ispartof | 2018 IEEE 36th International Conference on Computer Design (ICCD), 2018, p.301-309 |
issn | 2576-6996 |
language | eng |
recordid | cdi_ieee_primary_8615703 |
source | IEEE Xplore All Conference Series |
subjects | Bandwidth; data analysis; DRAM refresh; Magnetic resonance imaging; memory bandwidth; Memory management; Performance evaluation; Random access memory; read-intensive applications; redundant data storage; Task analysis |
title | DR DRAM: Accelerating Memory-Read-Intensive Applications |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T03%3A20%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=DR%20DRAM:%20Accelerating%20Memory-Read-Intensive%20Applications&rft.btitle=2018%20IEEE%2036th%20International%20Conference%20on%20Computer%20Design%20(ICCD)&rft.au=Cao,%20Yuhai&rft.date=2018-10&rft.spage=301&rft.epage=309&rft.pages=301-309&rft.eissn=2576-6996&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ICCD.2018.00053&rft.eisbn=1538684772&rft.eisbn_list=9781538684771&rft_dat=%3Cieee_CHZPO%3E8615703%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i175t-dfcecb4cd010c3b6a03b4cbd76f6b3ba83bc444b3d6b2b64f2ef784279627c333%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8615703&rfr_iscdi=true |