A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure

The mixed data sampling (MIDAS) model has attracted increasing attention due to its outstanding performance in dealing with mixed frequency data. However, most MIDAS model extension studies are based on statistical methods or machine learning models, which suffer from insufficient prediction perform...

Bibliographic Details
Published in: ISA transactions 2024-04, Vol.147, p.304-327
Main Authors: Chen, Rongxing, Xiao, Xinping, Gao, Mingyun, Ding, Qi
Format: Article
Language: English
Subjects:
cited_by cdi_FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13
cites cdi_FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13
container_end_page 327
container_issue
container_start_page 304
container_title ISA transactions
container_volume 147
creator Chen, Rongxing
Xiao, Xinping
Gao, Mingyun
Ding, Qi
description The mixed data sampling (MIDAS) model has attracted increasing attention due to its outstanding performance in dealing with mixed frequency data. However, most MIDAS model extension studies are based on statistical methods or machine learning models, which suffer from insufficient prediction performance and stability in small sample environments. To solve this problem, this paper proposes a novel mixed frequency sampling discrete grey model (MDGM(1, N)), which is a coupled form of the MIDAS model and the discrete grey multivariate model. By adjusting the structure parameters, the model can be adapted to data with different sampling frequencies, and can degenerate into several types of grey models. Then, the unbiasedness and stability of the model are proved using the mathematical analysis method and a numerical random experiment. A meta-heuristic algorithm is introduced to obtain the optimal weight parameters and the maximum lag order, improving the model's fitting ability to mixed frequency data. To demonstrate the effectiveness of the new model, a model evaluation system consisting of traditional evaluation metrics and a monotonicity test is established. Taking four hard disk drive failure datasets as research cases, the performance of the proposed model is compared with seven mainstream benchmark models. The results show that the proposed model has excellent applicability and outperforms the other competing models in terms of validity, stability, and robustness. Furthermore, it is observed that the reported uncorrectable errors and the command timeout have a greater impact on hard disk drive failure. Finally, the new model is employed to forecast the failure of four hard disk drives. The forecasting results indicate that in the next four time points with a cycle of 21 days beginning in April 2023, the failure of the smaller capacity hard disk drives (0055 and 0086, corresponding to 8TB and 10TB) shows a decreasing trend, reaching 67.442% and 89.7683%, respectively. The failure of the other larger capacity hard disk drives (0007 and 0138, corresponding to 12TB and 14TB) has increased, with a growth rate of 17.1016% and 123.7899%.
• A novel mixed frequency sampling discrete grey model is proposed.
• The proposed model is proved to be unbiased and stable theoretically.
• A Chimpanzee Optimization algorithm is introduced to globally search for weight parameters and maximum lag order.
• A new model evaluation system for evaluating the effectiveness of prediction models is constructed.
• The proposed model outperforms the other seven benchmark models in four case studies.
doi_str_mv 10.1016/j.isatra.2024.02.023
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2954774816</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0019057824000867</els_id><sourcerecordid>2954774816</sourcerecordid><originalsourceid>FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13</originalsourceid><addsrcrecordid>eNp9kN1LwzAQwIMobk7_A5E8-tKaj7ZJX4QhfsHAlz0KIU2uM7NdZ9IO99-bsumjcEfg-F3u7ofQNSUpJbS4W6cu6N7rlBGWpYTF4CdoSqUok1hip2hKCC0Tkgs5QRchrAkhLC_lOZpwmeU8F-UUvc_xpttBg1v3DRbXHr4G2Jg9DrrdNm6zwtYF46EHvPKwx21nI1x3fkwwOvQj86G9HcFPbL3bAa61awYPl-is1k2Aq-M7Q8unx-XDS7J4e359mC8SwwvWJ5yXFrJKCigqYSyttGGFKaSuKRVWctCZ1aKguaDSlpplTPAarBGCalJTPkO3h2-3vovbh161cWdoGr2BbgiKlXkmRCZpEdHsgBrfheChVlvvWu33ihI1alVrddCqRq2KsBg8tt0cJwxVC_av6ddjBO4PAMQzdw68CsZFj2BdtNQr27n_J_wA7T2MHw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2954774816</pqid></control><display><type>article</type><title>A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure</title><source>ScienceDirect Freedom Collection</source><creator>Chen, Rongxing ; Xiao, Xinping ; Gao, Mingyun ; Ding, Qi</creator><creatorcontrib>Chen, Rongxing ; Xiao, Xinping ; Gao, Mingyun ; Ding, Qi</creatorcontrib><description>The mixed data sampling (MIDAS) model has attracted increasing attention due to its outstanding performance in dealing with mixed frequency data. However, most MIDAS model extension studies are based on statistical methods or machine learning models, which suffer from insufficient prediction performance and stability in small sample environments. To solve this problem, this paper proposes a novel mixed frequency sampling discrete grey model (MDGM(1, N)), which is a coupled form of the MIDAS model and discrete grey multivariate model. 
By adjusting the structure parameters, the model can be adapted to different sampling frequencies data, and degenerate into several types of grey models. Then, the unbiasedness and stability of the model are proved using the mathematical analysis method and numerical random experiment. The meta-heuristic algorithm is introduced to obtain the optimal weight parameters and the maximum lag order, improving the model's fitting ability to mixed frequency data. To demonstrate the effectiveness of the new model, a model evaluation system consisting of traditional evaluation metrics and a monotonicity test is established. Taking four hard disk drive failure datasets as research cases, the performance of the proposed model is compared with seven mainstream benchmark models. The results show that the proposed model has excellent applicability and outperforms other competition models in terms of validity, stability, and robustness. Furthermore, it is observed that the reported uncorrectable errors and the command timeout have a greater impact on hard disk drive failure. Finally, the new model is employed to forecast the failure of four hard disk drives. The forecasting results indicate that in the next four time points with a cycle of 21 days beginning in April 2023, the failure of the smaller capacity hard disk drives (0055 and 0086, corresponding to 8TB and 10TB) show a decreasing trend, reaching 67.442% and 89.7683%, respectively. The failure of the other larger capacity hard disk drives (0007 and 0138, corresponding to 12TB and 14TB) has increased, with a growth rate of 17.1016% and 123.7899%. 
[Display omitted] •A novel mixed frequency sampling discrete grey model is proposed.•The proposed model is proved to be unbiased and stable theoretically.•A Chimpanzee Optimization algorithm is introduced to globally search for weight parameters and maximum lag order.•A new model evaluation system for evaluating the effectiveness of prediction models is constructed.•The proposed model outperforms the other seven benchmark models in four case studies.</description><identifier>ISSN: 0019-0578</identifier><identifier>EISSN: 1879-2022</identifier><identifier>DOI: 10.1016/j.isatra.2024.02.023</identifier><identifier>PMID: 38453579</identifier><language>eng</language><publisher>United States: Elsevier Ltd</publisher><subject>Chimp optimization algorithm ; Hard disk drive failure forecasting ; Mixed data sampling ; Mixed frequency sampling grey model</subject><ispartof>ISA transactions, 2024-04, Vol.147, p.304-327</ispartof><rights>2024 ISA</rights><rights>Copyright © 2024 ISA. Published by Elsevier Ltd. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13</citedby><cites>FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38453579$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chen, Rongxing</creatorcontrib><creatorcontrib>Xiao, Xinping</creatorcontrib><creatorcontrib>Gao, Mingyun</creatorcontrib><creatorcontrib>Ding, Qi</creatorcontrib><title>A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure</title><title>ISA transactions</title><addtitle>ISA Trans</addtitle><description>The mixed data sampling (MIDAS) model has attracted increasing attention due to its outstanding performance in dealing with mixed frequency data. However, most MIDAS model extension studies are based on statistical methods or machine learning models, which suffer from insufficient prediction performance and stability in small sample environments. To solve this problem, this paper proposes a novel mixed frequency sampling discrete grey model (MDGM(1, N)), which is a coupled form of the MIDAS model and discrete grey multivariate model. By adjusting the structure parameters, the model can be adapted to different sampling frequencies data, and degenerate into several types of grey models. Then, the unbiasedness and stability of the model are proved using the mathematical analysis method and numerical random experiment. 
The meta-heuristic algorithm is introduced to obtain the optimal weight parameters and the maximum lag order, improving the model's fitting ability to mixed frequency data. To demonstrate the effectiveness of the new model, a model evaluation system consisting of traditional evaluation metrics and a monotonicity test is established. Taking four hard disk drive failure datasets as research cases, the performance of the proposed model is compared with seven mainstream benchmark models. The results show that the proposed model has excellent applicability and outperforms other competition models in terms of validity, stability, and robustness. Furthermore, it is observed that the reported uncorrectable errors and the command timeout have a greater impact on hard disk drive failure. Finally, the new model is employed to forecast the failure of four hard disk drives. The forecasting results indicate that in the next four time points with a cycle of 21 days beginning in April 2023, the failure of the smaller capacity hard disk drives (0055 and 0086, corresponding to 8TB and 10TB) show a decreasing trend, reaching 67.442% and 89.7683%, respectively. The failure of the other larger capacity hard disk drives (0007 and 0138, corresponding to 12TB and 14TB) has increased, with a growth rate of 17.1016% and 123.7899%. 
[Display omitted] •A novel mixed frequency sampling discrete grey model is proposed.•The proposed model is proved to be unbiased and stable theoretically.•A Chimpanzee Optimization algorithm is introduced to globally search for weight parameters and maximum lag order.•A new model evaluation system for evaluating the effectiveness of prediction models is constructed.•The proposed model outperforms the other seven benchmark models in four case studies.</description><subject>Chimp optimization algorithm</subject><subject>Hard disk drive failure forecasting</subject><subject>Mixed data sampling</subject><subject>Mixed frequency sampling grey model</subject><issn>0019-0578</issn><issn>1879-2022</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kN1LwzAQwIMobk7_A5E8-tKaj7ZJX4QhfsHAlz0KIU2uM7NdZ9IO99-bsumjcEfg-F3u7ofQNSUpJbS4W6cu6N7rlBGWpYTF4CdoSqUok1hip2hKCC0Tkgs5QRchrAkhLC_lOZpwmeU8F-UUvc_xpttBg1v3DRbXHr4G2Jg9DrrdNm6zwtYF46EHvPKwx21nI1x3fkwwOvQj86G9HcFPbL3bAa61awYPl-is1k2Aq-M7Q8unx-XDS7J4e359mC8SwwvWJ5yXFrJKCigqYSyttGGFKaSuKRVWctCZ1aKguaDSlpplTPAarBGCalJTPkO3h2-3vovbh161cWdoGr2BbgiKlXkmRCZpEdHsgBrfheChVlvvWu33ihI1alVrddCqRq2KsBg8tt0cJwxVC_av6ddjBO4PAMQzdw68CsZFj2BdtNQr27n_J_wA7T2MHw</recordid><startdate>202404</startdate><enddate>202404</enddate><creator>Chen, Rongxing</creator><creator>Xiao, Xinping</creator><creator>Gao, Mingyun</creator><creator>Ding, Qi</creator><general>Elsevier Ltd</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>202404</creationdate><title>A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure</title><author>Chen, Rongxing ; Xiao, Xinping ; Gao, Mingyun ; Ding, 
Qi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Chimp optimization algorithm</topic><topic>Hard disk drive failure forecasting</topic><topic>Mixed data sampling</topic><topic>Mixed frequency sampling grey model</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chen, Rongxing</creatorcontrib><creatorcontrib>Xiao, Xinping</creatorcontrib><creatorcontrib>Gao, Mingyun</creatorcontrib><creatorcontrib>Ding, Qi</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>ISA transactions</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chen, Rongxing</au><au>Xiao, Xinping</au><au>Gao, Mingyun</au><au>Ding, Qi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure</atitle><jtitle>ISA transactions</jtitle><addtitle>ISA Trans</addtitle><date>2024-04</date><risdate>2024</risdate><volume>147</volume><spage>304</spage><epage>327</epage><pages>304-327</pages><issn>0019-0578</issn><eissn>1879-2022</eissn><abstract>The mixed data sampling (MIDAS) model has attracted increasing attention due to its outstanding performance in dealing with mixed frequency data. However, most MIDAS model extension studies are based on statistical methods or machine learning models, which suffer from insufficient prediction performance and stability in small sample environments. 
To solve this problem, this paper proposes a novel mixed frequency sampling discrete grey model (MDGM(1, N)), which is a coupled form of the MIDAS model and discrete grey multivariate model. By adjusting the structure parameters, the model can be adapted to different sampling frequencies data, and degenerate into several types of grey models. Then, the unbiasedness and stability of the model are proved using the mathematical analysis method and numerical random experiment. The meta-heuristic algorithm is introduced to obtain the optimal weight parameters and the maximum lag order, improving the model's fitting ability to mixed frequency data. To demonstrate the effectiveness of the new model, a model evaluation system consisting of traditional evaluation metrics and a monotonicity test is established. Taking four hard disk drive failure datasets as research cases, the performance of the proposed model is compared with seven mainstream benchmark models. The results show that the proposed model has excellent applicability and outperforms other competition models in terms of validity, stability, and robustness. Furthermore, it is observed that the reported uncorrectable errors and the command timeout have a greater impact on hard disk drive failure. Finally, the new model is employed to forecast the failure of four hard disk drives. The forecasting results indicate that in the next four time points with a cycle of 21 days beginning in April 2023, the failure of the smaller capacity hard disk drives (0055 and 0086, corresponding to 8TB and 10TB) show a decreasing trend, reaching 67.442% and 89.7683%, respectively. The failure of the other larger capacity hard disk drives (0007 and 0138, corresponding to 12TB and 14TB) has increased, with a growth rate of 17.1016% and 123.7899%. 
[Display omitted] •A novel mixed frequency sampling discrete grey model is proposed.•The proposed model is proved to be unbiased and stable theoretically.•A Chimpanzee Optimization algorithm is introduced to globally search for weight parameters and maximum lag order.•A new model evaluation system for evaluating the effectiveness of prediction models is constructed.•The proposed model outperforms the other seven benchmark models in four case studies.</abstract><cop>United States</cop><pub>Elsevier Ltd</pub><pmid>38453579</pmid><doi>10.1016/j.isatra.2024.02.023</doi><tpages>24</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0019-0578
ispartof ISA transactions, 2024-04, Vol.147, p.304-327
issn 0019-0578
1879-2022
language eng
recordid cdi_proquest_miscellaneous_2954774816
source ScienceDirect Freedom Collection
subjects Chimp optimization algorithm
Hard disk drive failure forecasting
Mixed data sampling
Mixed frequency sampling grey model
title A novel mixed frequency sampling discrete grey model for forecasting hard disk drive failure
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T18%3A13%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20mixed%20frequency%20sampling%20discrete%20grey%20model%20for%20forecasting%20hard%20disk%20drive%20failure&rft.jtitle=ISA%20transactions&rft.au=Chen,%20Rongxing&rft.date=2024-04&rft.volume=147&rft.spage=304&rft.epage=327&rft.pages=304-327&rft.issn=0019-0578&rft.eissn=1879-2022&rft_id=info:doi/10.1016/j.isatra.2024.02.023&rft_dat=%3Cproquest_cross%3E2954774816%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c362t-339de4b87e6b7cd1bac26c68af117d83ea4da7615718d9a24273fedc771a0f13%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2954774816&rft_id=info:pmid/38453579&rfr_iscdi=true