Control infrastructure monitoring system at the NSRL facility cluster

Bibliographic Details
Published in:Journal of instrumentation 2022-11, Vol.17 (11), p.P11005
Main Authors: Qin, T., Li, C., Liu, G.F.
Format: Article
Language:English
Description
Summary:The National Synchrotron Radiation Laboratory (NSRL) facility cluster is a collection of user facilities developed by NSRL, including the Hefei Light Source-II (HLS-II), the Tunable Infrared Laser for Fundamental of Energy Chemistry (FELiChEM), and the THz near-field high-flux material physical property test system (NFTHZ). User facilities generally have high operational availability requirements. The NSRL facility cluster relies on the control infrastructure to provide computing, network, and storage resources, as well as various services that must be available 24 hours a day, 7 days a week. The monitoring system tracks the operational status of the control infrastructure, gathers information on faults, performance degradation, and cybersecurity issues, and distributes alarm messages promptly. It helps operators troubleshoot problems efficiently, which improves the availability of the user facilities. The monitoring system was developed by integrating several free and open-source software tools. Zabbix was selected as the monitoring tool and collects metrics data from the control infrastructure. Three upper-layer applications were developed for data visualization: the dashboard shows the operational status of the network and various devices, the alarm system collects alarm messages and distributes them via a web-based GUI and WeChat, and the reporting system periodically generates metrics and alarm reports. The monitoring system has been in operation since March 2022. The results indicate that it can effectively identify hidden hazards in the control infrastructure.
ISSN:1748-0221
DOI:10.1088/1748-0221/17/11/P11005
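
The summary describes an alarm system that collects problems detected by Zabbix and distributes them as messages. The following is a minimal sketch of that idea, not the authors' implementation: it builds a request body for the Zabbix JSON-RPC method `problem.get` and formats returned problem records as alarm lines. The function names, the severity threshold, and the sample records are hypothetical; authentication, the HTTP call, and delivery to a web GUI or WeChat are deliberately left out.

```python
import json
from datetime import datetime, timezone

# Zabbix trigger severity levels (0-5) and their standard names.
SEVERITY_NAMES = {
    0: "Not classified",
    1: "Information",
    2: "Warning",
    3: "Average",
    4: "High",
    5: "Disaster",
}


def build_problem_request(min_severity=2, request_id=1):
    """Build a JSON-RPC 2.0 body for the Zabbix `problem.get` method.

    Assumption: only problems at or above `min_severity` are of interest.
    The API token is not included; how it is passed depends on the
    Zabbix version in use.
    """
    return {
        "jsonrpc": "2.0",
        "method": "problem.get",
        "params": {
            "output": ["eventid", "name", "severity", "clock"],
            "severities": list(range(min_severity, 6)),
            "sortfield": ["eventid"],
            "sortorder": "DESC",
        },
        "id": request_id,
    }


def format_alarms(problems):
    """Turn problem records into one-line alarm messages, most severe first.

    Zabbix returns numeric fields as strings, so they are converted here.
    """
    ordered = sorted(
        problems, key=lambda p: (-int(p["severity"]), int(p["clock"]))
    )
    lines = []
    for p in ordered:
        when = datetime.fromtimestamp(int(p["clock"]), tz=timezone.utc)
        label = SEVERITY_NAMES.get(int(p["severity"]), "Unknown")
        lines.append(f"[{label}] {when:%Y-%m-%d %H:%M:%S} UTC {p['name']}")
    return lines


if __name__ == "__main__":
    # Hypothetical records shaped like a `problem.get` result.
    sample = [
        {"eventid": "1", "name": "Switch port down",
         "severity": "2", "clock": "1648771200"},
        {"eventid": "2", "name": "Disk full",
         "severity": "4", "clock": "1648771260"},
    ]
    print(json.dumps(build_problem_request(), indent=2))
    for line in format_alarms(sample):
        print(line)
```

Keeping request construction and message formatting as pure functions makes both easy to test without a running Zabbix server; the transport layer can be added around them.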