A caching mechanism to exploit object store speed in High Energy Physics analysis

Bibliographic Details
Published in: Cluster computing 2023-10, Vol. 26 (5), p. 2757-2772
Main Authors: Padulano, Vincenzo Eduardo, Tejedor Saavedra, Enric, Alonso-Jordá, Pedro, López Gómez, Javier, Blomer, Jakob
Format: Article
Language: English
Description
Summary: Data analysis workflows in High Energy Physics (HEP) read data written in the ROOT columnar format. Such data has traditionally been stored in files that are often read via the network from remote storage facilities, which represents a performance penalty, especially for data processing workflows that are I/O bound. To address that issue, this paper presents a new caching mechanism, implemented in the I/O subsystem of ROOT, which is independent of the storage backend used to write the dataset. Notably, it can be used to leverage the speed of high-bandwidth, low-latency object stores. The performance of this caching approach is evaluated by running a real physics analysis on an Intel DAOS cluster, both on a single node and distributed on multiple nodes.
ISSN: 1386-7857, 1573-7543
DOI: 10.1007/s10586-022-03757-2