
rta-dq-lib: a software library to perform online data quality analysis of scientific data


Bibliographic Details
Published in: arXiv.org 2021-05
Main Authors: Baroncelli, Leonardo, Bulgarelli, Andrea, Parmiggiani, Nicolo, Fioretti, Valentina, Addis, Antonio, De Cesare, Giovanni, Di Piano, Ambra, Conforti, Vito, Gianotti, Fulvio, Russo, Federico, Maurin, Gilles, Vuillaume, Thomas, Aubert, Pierre, Garcia, Emilio, Zoccoli, Antonio
Format: Article
Language: English
Description
Summary: The Cherenkov Telescope Array (CTA) is an initiative that is currently building the largest ground-based gamma-ray observatory ever constructed. A Science Alert Generation (SAG) system, part of the Array Control and Data Acquisition (ACADA) system of the CTA Observatory, analyses the telescope data online, as they arrive at an event rate of tens of kHz, to detect transient gamma-ray events. The SAG system also performs an online data quality analysis to assess the instruments' health during data acquisition: this analysis is crucial to confirm good detections. A software library for the online data quality analysis of CTA data, called rta-dq-lib and available in Python and C++ versions, has been proposed for CTA. The Python version is dedicated to the rapid prototyping of data quality use cases. The C++ version is optimized for maximum performance. The library allows the user to define, through XML configuration files, the format of the input data and, for each data field, which quality checks must be performed and which types of aggregations and transformations must be applied. It internally translates the XML configuration into a directed acyclic computational graph that encodes the dependencies among the computational tasks to be performed. This model allows the library to easily take advantage of thread-level parallelization, and its overall flexibility allows us to develop generic data quality analysis pipelines that could also be reused in other applications.
ISSN:2331-8422
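
The summary describes a pipeline in which an XML configuration is translated into a directed acyclic graph of tasks that can then be run on several threads. The following Python sketch illustrates that general idea only; it is not the rta-dq-lib API. The XML element and attribute names (`dataquality`, `field`, `task`, `op`, `input`, `min`, `max`), the operation table `OPS`, and the functions `parse_tasks`, `build_graph`, and `run` are all hypothetical and were invented for this example. It relies only on the Python standard library (`graphlib` requires Python 3.9 or later).

```python
import xml.etree.ElementTree as ET
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical configuration format, invented for this sketch.
# Each task reads either a raw data field or the output of another task.
CONFIG = """
<dataquality>
  <field name="charge">
    <task id="charge_abs" op="abs" input="charge"/>
    <task id="charge_mean" op="mean" input="charge_abs"/>
    <task id="charge_ok" op="in_range" input="charge_mean" min="0" max="10"/>
  </field>
  <field name="time">
    <task id="time_max" op="max" input="time"/>
    <task id="time_ok" op="in_range" input="time_max" min="0" max="50"/>
  </field>
</dataquality>
"""

# Hypothetical operations: a transformation, two aggregations, one check.
OPS = {
    "abs": lambda xs, attrs: [abs(x) for x in xs],
    "mean": lambda xs, attrs: sum(xs) / len(xs),
    "max": lambda xs, attrs: max(xs),
    "in_range": lambda x, attrs: float(attrs["min"]) <= x <= float(attrs["max"]),
}


def parse_tasks(xml_text):
    """Return {task_id: attributes} for every <task> element."""
    root = ET.fromstring(xml_text)
    return {t.get("id"): dict(t.attrib) for t in root.iter("task")}


def build_graph(tasks):
    """Map each task id to the set of task ids it depends on."""
    return {
        tid: ({attrs["input"]} if attrs["input"] in tasks else set())
        for tid, attrs in tasks.items()
    }


def run(xml_text, data, max_workers=4):
    """Execute all tasks, running independent ones concurrently."""
    tasks = parse_tasks(xml_text)
    sorter = TopologicalSorter(build_graph(tasks))
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = dict(data)  # raw fields plus task outputs, keyed by name

    def execute(tid):
        attrs = tasks[tid]
        return tid, OPS[attrs["op"]](results[attrs["input"]], attrs)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = set()
        while sorter.is_active():
            # Submit every task whose dependencies are already satisfied.
            for tid in sorter.get_ready():
                pending.add(pool.submit(execute, tid))
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                tid, value = future.result()
                results[tid] = value  # store before unlocking dependants
                sorter.done(tid)
    return results


if __name__ == "__main__":
    sample = {"charge": [-2.0, 4.0, 6.0], "time": [10.0, 60.0, 30.0]}
    output = run(CONFIG, sample)
    print(output["charge_ok"], output["time_ok"])
```

In this sketch the two fields form independent chains, so `charge_abs` and `time_max` can be submitted to the thread pool at the same time, while each check waits for its own aggregation. With the sample data, the charge check passes (the mean of the absolute values is 4.0) and the time check fails (the maximum, 60.0, exceeds 50). Pure-Python tasks like these gain little from threads because of the interpreter lock; the point is only to show how a dependency graph exposes which tasks may run concurrently.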