A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure

Bibliographic Details
Published in: ISA Transactions, 2024-04, Vol. 147, p. 304-327
Main Authors: Chen, Rongxing; Xiao, Xinping; Gao, Mingyun; Ding, Qi
Format: Article
Language: English
Description
Summary: The mixed data sampling (MIDAS) model has attracted increasing attention due to its outstanding performance in dealing with mixed frequency data. However, most extensions of the MIDAS model are based on statistical methods or machine learning models, which suffer from insufficient prediction performance and stability in small-sample settings. To solve this problem, this paper proposes a novel mixed frequency sampling discrete grey model (MDGM(1, N)), which couples the MIDAS model with the discrete grey multivariate model. By adjusting the structure parameters, the model can be adapted to data with different sampling frequencies and can degenerate into several types of grey models. The unbiasedness and stability of the model are then proved using mathematical analysis and numerical random experiments. A meta-heuristic algorithm is introduced to obtain the optimal weight parameters and the maximum lag order, improving the model's ability to fit mixed frequency data. To demonstrate the effectiveness of the new model, a model evaluation system consisting of traditional evaluation metrics and a monotonicity test is established. Taking four hard disk drive failure datasets as research cases, the performance of the proposed model is compared with seven mainstream benchmark models. The results show that the proposed model has excellent applicability and outperforms the competing models in terms of validity, stability, and robustness. Furthermore, it is observed that reported uncorrectable errors and command timeouts have a greater impact on hard disk drive failure. Finally, the new model is employed to forecast the failure of four hard disk drives. The forecasting results indicate that over the next four time points, with a cycle of 21 days beginning in April 2023, the failures of the smaller-capacity hard disk drives (0055 and 0086, corresponding to 8TB and 10TB) show a decreasing trend, reaching 67.442% and 89.7683%, respectively. The failures of the larger-capacity hard disk drives (0007 and 0138, corresponding to 12TB and 14TB) increase, with growth rates of 17.1016% and 123.7899%, respectively.
•A novel mixed frequency sampling discrete grey model is proposed.
•The proposed model is proved to be unbiased and stable theoretically.
•A Chimpanzee Optimization algorithm is introduced to globally search for weight parameters and maximum lag order.
•A new model evaluation system for evaluating the effectiveness of prediction
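
For readers unfamiliar with discrete grey models, the following is a minimal sketch of the classic single-variable DGM(1,1), the simplest member of the grey model family that the summary says the proposed model can degenerate into. It is not the paper's MDGM(1, N): it has no MIDAS weighting, no driving variables, and no meta-heuristic search. The function names `fit_dgm11` and `forecast_dgm11` and the sample series are assumptions made for illustration only.

```python
import numpy as np


def fit_dgm11(x0):
    """Least-squares fit of x1(k+1) = b1 * x1(k) + b2,
    where x1 is the cumulative sum (1-AGO) of the raw series x0."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)
    # Design matrix: one row [x1(k), 1] per transition k -> k+1.
    A = np.column_stack([x1[:-1], np.ones(len(x1) - 1)])
    (b1, b2), *_ = np.linalg.lstsq(A, x1[1:], rcond=None)
    return b1, b2


def forecast_dgm11(x0, steps):
    """Forecast `steps` future values of the raw series x0."""
    x0 = np.asarray(x0, dtype=float)
    b1, b2 = fit_dgm11(x0)
    n = len(x0)
    # Iterate the fitted recursion on the accumulated series,
    # starting from the first observation.
    x1_hat = np.empty(n + steps)
    x1_hat[0] = x0[0]
    for k in range(1, n + steps):
        x1_hat[k] = b1 * x1_hat[k - 1] + b2
    # Undo the accumulation by first differences to return to the raw scale.
    x0_hat = np.diff(x1_hat, prepend=0.0)
    return x0_hat[n:]


# Synthetic, purely geometric series (illustrative data, not from the paper).
series = [2.0 * 1.1 ** k for k in range(6)]
print(forecast_dgm11(series, steps=2))
```

For a purely geometric series the accumulated sequence satisfies the linear recursion exactly, so the fit continues the pattern up to floating-point error. This is the single-variable analogue of the unbiasedness property mentioned in the summary.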
ISSN: 0019-0578, 1879-2022
DOI: 10.1016/j.isatra.2024.02.023