ArchiveDB—Scientific and technical data archive for Wendelstein 7-X
Published in: | Fusion engineering and design 2016-11, Vol.112, p.984-990 |
---|---|
Main Authors: | , , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • ArchiveDB archives all scientific and technical data of Wendelstein 7-X.
• The primary index is the measured absolute time.
• Continuously arising data is chunked in time for storage.
• The Big Data Lambda Architecture pattern is applied.
• The system has been in place for a decade, and a major change of the underlying technology has been mastered.
ArchiveDB is the data archive for all scientific and technical data collected at the Wendelstein 7-X project. It is a distributed system that allows continuous data archival. ArchiveDB has demanding requirements regarding performance efficiency (storage performance of 30 GB/s during experiments, expected storage volume of 1.4 PB/year), reliability (availability of 364 days/year), maintainability (testability) and portability (including changes of hardware and software).
Continuous data acquisition at high time resolution (down to the nanosecond scale) is supported for physics data, as is long-term, round-the-clock recording (24 h a day, 7 days a week) of operational data (∼1 Hz rate). Moreover, all results of data analysis are stored in the archive. A further challenge, the uniform retrieval of measured and analyzed data with time and structure information as selection criteria, has been mastered as well.
The key concepts of data storage and retrieval are: (1) partitioning of incoming data into groups and streams, (2) chunking of data into boxes of manageable size, each covering a finite time period, and (3) indexing of data using absolute time as the ordering and indexing criterion.
The ArchiveDB software and hardware have been operated continuously and successfully for several years for various systems and components relevant to Wendelstein 7-X, showing that the key requirements are satisfied. The overall data volume so far has reached 7 terabytes over 9 years of data taking. Round-the-clock operation of the archive has been in place for 5 years. The initial plasma operation phase OP1.1 of Wendelstein 7-X was supported with no downtime during the whole experimental campaign.
The paper describes the software engineering concepts that have been used, consolidated and refined over the years of continuing productive ArchiveDB use and development. Changes in the underlying techniques, e.g. a change of the data store, have been encapsulated via an Application Programming Interface (API). This API unifies different implementations and is also suitable for data migration. |
---|---|
ISSN: | 0920-3796 1873-7196 |
DOI: | 10.1016/j.fusengdes.2016.05.026 |
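The three key concepts named in the summary (partitioning into groups and streams, chunking into time-bounded boxes, and indexing by absolute time) can be illustrated with a minimal in-memory sketch. This is not the ArchiveDB implementation or its API: the class `ToyArchive`, the `Box` type, the one-second box span and the names `demo_group` and `demo_stream` are all hypothetical, chosen only to show how a time-range query can skip boxes that lie outside the requested interval.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from dataclasses import dataclass, field

NS_PER_S = 1_000_000_000  # timestamps are integer nanoseconds (assumption)


@dataclass
class Box:
    """One chunk of a stream covering [start_ns, start_ns + box span)."""
    start_ns: int
    times: list = field(default_factory=list)
    values: list = field(default_factory=list)


class ToyArchive:
    """Hypothetical illustration only; not the ArchiveDB API."""

    def __init__(self, box_span_ns=NS_PER_S):
        self.box_span_ns = box_span_ns
        # (1) partitioning: one box table per (group, stream) pair
        self._boxes = defaultdict(dict)

    def append(self, group, stream, t_ns, value):
        # (2) chunking: the box is chosen from the sample's absolute time
        start = t_ns - t_ns % self.box_span_ns
        box = self._boxes[(group, stream)].setdefault(start, Box(start))
        # assumes samples of one stream arrive in time order
        box.times.append(t_ns)
        box.values.append(value)

    def read(self, group, stream, t_from_ns, t_to_ns):
        """Return (time, value) pairs with t_from_ns <= time <= t_to_ns."""
        boxes = self._boxes.get((group, stream), {})
        out = []
        # (3) indexing: boxes are visited in absolute-time order
        for start in sorted(boxes):
            if start + self.box_span_ns <= t_from_ns or start > t_to_ns:
                continue  # box lies entirely outside the requested interval
            box = boxes[start]
            lo = bisect_left(box.times, t_from_ns)
            hi = bisect_right(box.times, t_to_ns)
            out.extend(zip(box.times[lo:hi], box.values[lo:hi]))
        return out


archive = ToyArchive(box_span_ns=NS_PER_S)
t0 = 1_450_000_000 * NS_PER_S  # arbitrary absolute start time
for i in range(5):  # five samples, 0.5 s apart
    archive.append("demo_group", "demo_stream", t0 + i * 500_000_000, float(i))

samples = archive.read("demo_group", "demo_stream",
                       t0 + 500_000_000, t0 + 1_500_000_000)
print([value for _, value in samples])
```

In this toy setup the five samples fall into three one-second boxes, and the query touches only the two boxes that overlap the requested interval. A real archive would persist boxes and keep the index outside process memory; the sketch only shows the selection logic.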