Versioning Architectures for Local and Global Memory

Bibliographic Details
Main Authors: Fujita, Hajime; Iskra, Kamil; Balaji, Pavan; Chien, Andrew A.
Format: Conference Proceeding
Language: English
Description
Summary: Future supercomputer systems will face serious reliability challenges. Among failure scenarios, latent errors are some of the most serious and concerning. Preserving multiple versions of critical data is a promising approach to deal with such errors. We are developing the Global View Resilience (GVR) library, with multi-version global arrays as one of the key features. This paper presents three array versioning architectures: flat array, flat array with change tracking, and log-structured array. We use a synthetic workload that mimics the memory access patterns of radix sort, N-body simulation, and matrix multiplication, comparing the three array architectures in terms of runtime performance, memory requirements, and version restoration costs. The experiments show that the flat array with change tracking is the best architecture in terms of runtime performance: for versioning frequencies of 10^-5 ops^-1 or higher, it matches the second-best architecture or beats it by up to 23 times. The log-structured array is preferable for low memory usage, since it saves up to 98% of memory compared with a flat array.
ISSN: 2690-5965, 1521-9097
DOI: 10.1109/ICPADS.2015.71
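
The summary above contrasts a flat array, which keeps a full copy per version, with a log-structured array, which is preferable for memory usage. The following is a minimal Python sketch of that trade-off, not the GVR library's API or the paper's implementation. The class names, the block size `BLOCK`, and the block-level dirty set (a loose stand-in for change tracking) are all assumptions made for illustration.

```python
BLOCK = 4  # elements per block; a hypothetical granularity for this sketch


class FlatVersioned:
    """Flat array: every snapshot stores a full copy of the array."""

    def __init__(self, n):
        self.data = [0] * n
        self.versions = []

    def write(self, i, value):
        self.data[i] = value

    def snapshot(self):
        self.versions.append(list(self.data))

    def restore(self, k):
        return list(self.versions[k])

    def stored_elements(self):
        return sum(len(v) for v in self.versions)


class LogVersioned:
    """Log-structured sketch: a snapshot appends only the blocks
    modified since the previous snapshot to a shared log."""

    def __init__(self, n):
        self.n = n
        self.data = [0] * n
        self.dirty = set()   # block ids written since the last snapshot
        self.log = []        # appended block contents
        self.index = []      # per version: {block id: position in log}

    def write(self, i, value):
        self.data[i] = value
        self.dirty.add(i // BLOCK)

    def snapshot(self):
        entry = {}
        for b in sorted(self.dirty):
            entry[b] = len(self.log)
            self.log.append(self.data[b * BLOCK:(b + 1) * BLOCK])
        self.index.append(entry)
        self.dirty.clear()

    def restore(self, k):
        # Blocks never written stay at the initial value 0.
        out = [0] * self.n
        n_blocks = (self.n + BLOCK - 1) // BLOCK
        for b in range(n_blocks):
            # Walk back from version k to the newest copy of block b.
            for v in range(k, -1, -1):
                if b in self.index[v]:
                    block = self.log[self.index[v][b]]
                    out[b * BLOCK:b * BLOCK + len(block)] = block
                    break
        return out

    def stored_elements(self):
        return sum(len(b) for b in self.log)


flat, log = FlatVersioned(16), LogVersioned(16)
for step in range(3):
    for arr in (flat, log):
        arr.write(step, step + 1)  # all three writes land in block 0
        arr.snapshot()

print(flat.stored_elements(), log.stored_elements())
```

In this toy run both structures restore identical contents for any version, but the flat array stores every element for every snapshot while the log stores only the touched block. The restore loop also hints at why the summary treats version restoration cost as a separate axis of comparison: the log-structured variant has to search back through earlier versions to reassemble an array.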