Loading…
Comparing and combining read miss clustering and software prefetching
A recent latency tolerance technique, read-miss clustering, restructures code to send demand-miss references in parallel to the underlying memory system. An alternative, widely-used latency tolerance technique is software prefetching, which initiates data fetches ahead of expected demand-miss refere...
Saved in:
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | A recent latency tolerance technique, read-miss clustering, restructures code to send demand-miss references in parallel to the underlying memory system. An alternative, widely-used latency tolerance technique is software prefetching, which initiates data fetches ahead of expected demand-miss references by a certain distance. Since both techniques seem to target the same types of latencies and use the same system resources, it is unclear which technique is superior or if both can be combined. This paper shows that these two techniques are actually mutually beneficial, each helping to overcome limitations of the other: We perform our study for uniprocessor and multiprocessor configurations, in simulation and on a real machine (the Convex Exemplar). Compared to prefetching alone (the state-of-the-art implemented in systems today), the combination of the two techniques reduces the execution time by an average of 21% across all cases studied in simulation, and by an average of 16% for 5 out of 10 cases on the Exemplar. The combination sees execution time reductions relative to clustering alone averaging 15% for 6 out of 11 cases in simulation and 20% for 6 out of 10 cases on the Exemplar. |
---|---|
ISSN: | 1089-796X |
DOI: | 10.1109/PACT.2001.953310 |