Loading…

I/O Deduplication: Utilizing content similarity to improve I/O performance

Duplication of data in storage systems is becoming increasingly common. We introduce I/O Deduplication, a storage optimization that utilizes content similarity for improving I/O performance by eliminating I/O operations and reducing the mechanical delays during I/O operations. I/O Deduplication cons...

Full description

Saved in:
Bibliographic Details
Published in:ACM transactions on storage 2010-09, Vol.6 (3)
Main Authors: Koller, Ricardo, Rangaswami, Raju
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Duplication of data in storage systems is becoming increasingly common. We introduce I/O Deduplication, a storage optimization that utilizes content similarity for improving I/O performance by eliminating I/O operations and reducing the mechanical delays during I/O operations. I/O Deduplication consists of three main techniques: content-based caching, dynamic replica retrieval, and selective duplication. Each of these techniques is motivated by our observations with I/O workload traces obtained from actively-used production storage systems, all of which revealed surprisingly high levels of content similarity for both stored and accessed data. Evaluation of a prototype implementation using these workloads showed an overall improvement in disk I/O performance of 28 to 47% across these workloads. Further breakdown also showed that each of the three techniques contributed significantly to the overall performance improvement.
ISSN:1553-3077
DOI:10.1145/201045.201046