
On the structure of dynamic principal component analysis used in statistical process monitoring

When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time-dependent dat...

Bibliographic Details
Published in: Chemometrics and intelligent laboratory systems 2017-08, Vol.167, p.1-11
Main Authors: Vanhatalo, Erik, Kulahci, Murat, Bergquist, Bjarne
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3
cites cdi_FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3
container_end_page 11
container_issue
container_start_page 1
container_title Chemometrics and intelligent laboratory systems
container_volume 167
creator Vanhatalo, Erik
Kulahci, Murat
Bergquist, Bjarne
description When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time-dependent data. In DPCA the input matrix is augmented by adding time-lagged values of the variables. In building a DPCA model the analyst needs to decide on (1) the number of lags to add, and (2) given a specific lag structure, how many principal components to retain. In this article we propose a new analyst driven method to determine the maximum number of lags in DPCA with a foundation in multivariate time series analysis. The method is based on the behavior of the eigenvalues of the lagged autocorrelation and partial autocorrelation matrices. Given a specific lag structure we also propose a method for determining the number of principal components to retain. The number of retained principal components is determined by visual inspection of the serial correlation in the squared prediction error statistic, Q (SPE), together with the cumulative explained variance of the model. The methods are illustrated using simulated vector autoregressive and moving average data, and tested on Tennessee Eastman process data.
• A new method to determine the number of lags in Dynamic PCA (DPCA) is proposed.
• The proposed lag selection method applies multivariate time series theory.
• A visual method to choose the number of PCs to retain is also proposed.
• Simulated VAR(1) and VMA(1) data are used for tests and illustrations.
• The methods perform well when tested on Tennessee Eastman Process data.
doi_str_mv 10.1016/j.chemolab.2017.05.016
format article
fullrecord <record><control><sourceid>elsevier_swepu</sourceid><recordid>TN_cdi_swepub_primary_oai_DiVA_org_ltu_63377</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0169743917300734</els_id><sourcerecordid>S0169743917300734</sourcerecordid><originalsourceid>FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3</originalsourceid><addsrcrecordid>eNqFkN1KxDAQhYMouP68guQBbE2abNPeKf6D4I16G7LTqWZpk5Kkyr69WVa99WqYw_kOM4eQM85Kznh9sS7hA0c_mFVZMa5KtiyzvEcWvFGiEJVo98kiK22hpGgPyVGMa7bdJV8Q_exo-kAaU5ghzQGp72m3cWa0QKdgHdjJDBT8OHmHLlHjzLCJNtI5Ykety6RJNiYL2TYFDxgjHb2zyWf6_YQc9GaIePozj8nr3e3L9UPx9Hz_eH31VIBoVSrqHlE2jamYWjImRWUUsK6WXdMpaFayxZq3AD00HedNA7KqK9n3aiUNctEacUzOd7nxC6d5pfPpowkb7Y3VN_btSvvwroc061oIpbK93tkh-BgD9n8AZ3rbql7r31b1tlXNljrLGbzcgZif-bQYdASLDrCzASHpztv_Ir4Bm1uG9Q</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>On the structure of dynamic principal component analysis used in statistical process monitoring</title><source>ScienceDirect Freedom Collection 2022-2024</source><creator>Vanhatalo, Erik ; Kulahci, Murat ; Bergquist, Bjarne</creator><creatorcontrib>Vanhatalo, Erik ; Kulahci, Murat ; Bergquist, Bjarne</creatorcontrib><description>When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time-dependent data. In DPCA the input matrix is augmented by adding time-lagged values of the variables. In building a DPCA model the analyst needs to decide on (1) the number of lags to add, and (2) given a specific lag structure, how many principal components to retain. 
In this article we propose a new analyst driven method to determine the maximum number of lags in DPCA with a foundation in multivariate time series analysis. The method is based on the behavior of the eigenvalues of the lagged autocorrelation and partial autocorrelation matrices. Given a specific lag structure we also propose a method for determining the number of principal components to retain. The number of retained principal components is determined by visual inspection of the serial correlation in the squared prediction error statistic, Q (SPE), together with the cumulative explained variance of the model. The methods are illustrated using simulated vector autoregressive and moving average data, and tested on Tennessee Eastman process data. •A new method to determine the number of lags in Dynamic PCA (DPCA) is proposed.•The proposed lag selection method applies multivariate time series theory.•A visual method to choose the number of PCs to retain is also proposed.•Simulated VAR(1) and VMA(1) data are used for tests and illustrations.•The methods perform well when tested on Tennessee Eastman Process data.</description><identifier>ISSN: 0169-7439</identifier><identifier>ISSN: 1873-3239</identifier><identifier>EISSN: 1873-3239</identifier><identifier>DOI: 10.1016/j.chemolab.2017.05.016</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Autocorrelation ; Dynamic principal component analysis ; Kvalitetsteknik ; Quality Technology and Management ; Simulation ; Tennessee Eastman process simulator ; Vector autoregressive process ; Vector moving average process</subject><ispartof>Chemometrics and intelligent laboratory systems, 2017-08, Vol.167, p.1-11</ispartof><rights>2017 The 
Authors</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3</citedby><cites>FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27923,27924</link.rule.ids><backlink>$$Uhttps://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-63377$$DView record from Swedish Publication Index$$Hfree_for_read</backlink></links><search><creatorcontrib>Vanhatalo, Erik</creatorcontrib><creatorcontrib>Kulahci, Murat</creatorcontrib><creatorcontrib>Bergquist, Bjarne</creatorcontrib><title>On the structure of dynamic principal component analysis used in statistical process monitoring</title><title>Chemometrics and intelligent laboratory systems</title><description>When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time-dependent data. In DPCA the input matrix is augmented by adding time-lagged values of the variables. In building a DPCA model the analyst needs to decide on (1) the number of lags to add, and (2) given a specific lag structure, how many principal components to retain. In this article we propose a new analyst driven method to determine the maximum number of lags in DPCA with a foundation in multivariate time series analysis. The method is based on the behavior of the eigenvalues of the lagged autocorrelation and partial autocorrelation matrices. Given a specific lag structure we also propose a method for determining the number of principal components to retain. 
The number of retained principal components is determined by visual inspection of the serial correlation in the squared prediction error statistic, Q (SPE), together with the cumulative explained variance of the model. The methods are illustrated using simulated vector autoregressive and moving average data, and tested on Tennessee Eastman process data. •A new method to determine the number of lags in Dynamic PCA (DPCA) is proposed.•The proposed lag selection method applies multivariate time series theory.•A visual method to choose the number of PCs to retain is also proposed.•Simulated VAR(1) and VMA(1) data are used for tests and illustrations.•The methods perform well when tested on Tennessee Eastman Process data.</description><subject>Autocorrelation</subject><subject>Dynamic principal component analysis</subject><subject>Kvalitetsteknik</subject><subject>Quality Technology and Management</subject><subject>Simulation</subject><subject>Tennessee Eastman process simulator</subject><subject>Vector autoregressive process</subject><subject>Vector moving average process</subject><issn>0169-7439</issn><issn>1873-3239</issn><issn>1873-3239</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNqFkN1KxDAQhYMouP68guQBbE2abNPeKf6D4I16G7LTqWZpk5Kkyr69WVa99WqYw_kOM4eQM85Kznh9sS7hA0c_mFVZMa5KtiyzvEcWvFGiEJVo98kiK22hpGgPyVGMa7bdJV8Q_exo-kAaU5ghzQGp72m3cWa0QKdgHdjJDBT8OHmHLlHjzLCJNtI5Ykety6RJNiYL2TYFDxgjHb2zyWf6_YQc9GaIePozj8nr3e3L9UPx9Hz_eH31VIBoVSrqHlE2jamYWjImRWUUsK6WXdMpaFayxZq3AD00HedNA7KqK9n3aiUNctEacUzOd7nxC6d5pfPpowkb7Y3VN_btSvvwroc061oIpbK93tkh-BgD9n8AZ3rbql7r31b1tlXNljrLGbzcgZif-bQYdASLDrCzASHpztv_Ir4Bm1uG9Q</recordid><startdate>20170815</startdate><enddate>20170815</enddate><creator>Vanhatalo, Erik</creator><creator>Kulahci, Murat</creator><creator>Bergquist, Bjarne</creator><general>Elsevier 
B.V</general><scope>6I.</scope><scope>AAFTH</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ADTPV</scope><scope>AOWAS</scope></search><sort><creationdate>20170815</creationdate><title>On the structure of dynamic principal component analysis used in statistical process monitoring</title><author>Vanhatalo, Erik ; Kulahci, Murat ; Bergquist, Bjarne</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Autocorrelation</topic><topic>Dynamic principal component analysis</topic><topic>Kvalitetsteknik</topic><topic>Quality Technology and Management</topic><topic>Simulation</topic><topic>Tennessee Eastman process simulator</topic><topic>Vector autoregressive process</topic><topic>Vector moving average process</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Vanhatalo, Erik</creatorcontrib><creatorcontrib>Kulahci, Murat</creatorcontrib><creatorcontrib>Bergquist, Bjarne</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>CrossRef</collection><collection>SwePub</collection><collection>SwePub Articles</collection><jtitle>Chemometrics and intelligent laboratory systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Vanhatalo, Erik</au><au>Kulahci, Murat</au><au>Bergquist, Bjarne</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>On the structure of dynamic principal component analysis used in statistical process monitoring</atitle><jtitle>Chemometrics and intelligent laboratory 
systems</jtitle><date>2017-08-15</date><risdate>2017</risdate><volume>167</volume><spage>1</spage><epage>11</epage><pages>1-11</pages><issn>0169-7439</issn><issn>1873-3239</issn><eissn>1873-3239</eissn><abstract>When principal component analysis (PCA) is used for statistical process monitoring it relies on the assumption that data are time independent. However, industrial data will often exhibit serial correlation. Dynamic PCA (DPCA) has been suggested as a remedy for high-dimensional and time-dependent data. In DPCA the input matrix is augmented by adding time-lagged values of the variables. In building a DPCA model the analyst needs to decide on (1) the number of lags to add, and (2) given a specific lag structure, how many principal components to retain. In this article we propose a new analyst driven method to determine the maximum number of lags in DPCA with a foundation in multivariate time series analysis. The method is based on the behavior of the eigenvalues of the lagged autocorrelation and partial autocorrelation matrices. Given a specific lag structure we also propose a method for determining the number of principal components to retain. The number of retained principal components is determined by visual inspection of the serial correlation in the squared prediction error statistic, Q (SPE), together with the cumulative explained variance of the model. The methods are illustrated using simulated vector autoregressive and moving average data, and tested on Tennessee Eastman process data. 
•A new method to determine the number of lags in Dynamic PCA (DPCA) is proposed.•The proposed lag selection method applies multivariate time series theory.•A visual method to choose the number of PCs to retain is also proposed.•Simulated VAR(1) and VMA(1) data are used for tests and illustrations.•The methods perform well when tested on Tennessee Eastman Process data.</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.chemolab.2017.05.016</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0169-7439
ispartof Chemometrics and intelligent laboratory systems, 2017-08, Vol.167, p.1-11
issn 0169-7439
1873-3239
1873-3239
language eng
recordid cdi_swepub_primary_oai_DiVA_org_ltu_63377
source ScienceDirect Freedom Collection 2022-2024
subjects Autocorrelation
Dynamic principal component analysis
Kvalitetsteknik
Quality Technology and Management
Simulation
Tennessee Eastman process simulator
Vector autoregressive process
Vector moving average process
title On the structure of dynamic principal component analysis used in statistical process monitoring
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T20%3A53%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_swepu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20the%20structure%20of%20dynamic%20principal%20component%20analysis%20used%20in%20statistical%20process%20monitoring&rft.jtitle=Chemometrics%20and%20intelligent%20laboratory%20systems&rft.au=Vanhatalo,%20Erik&rft.date=2017-08-15&rft.volume=167&rft.spage=1&rft.epage=11&rft.pages=1-11&rft.issn=0169-7439&rft.eissn=1873-3239&rft_id=info:doi/10.1016/j.chemolab.2017.05.016&rft_dat=%3Celsevier_swepu%3ES0169743917300734%3C/elsevier_swepu%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c397t-6fee488a207500432a7c0d64d8d7c8b49e619ccfc8d1188c42624ff7b4ae139a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
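
The abstract in this record says that DPCA augments the input matrix with time-lagged values of the variables, and that the number of retained components is judged from the Q (SPE) statistic together with the cumulative explained variance. A minimal NumPy sketch of those two generic steps follows. It is not the authors' lag-selection method, which this record does not spell out. The function names, the VAR(1) coefficient matrix `phi`, the sample size, and the choices `lags=1` and `n_components=2` are all illustrative assumptions.

```python
import numpy as np


def lagged_matrix(X, lags):
    """Augment X (n x m) with its lagged copies: [X(t), X(t-1), ..., X(t-lags)]."""
    n = X.shape[0]
    # Each slice has n - lags rows; slice k holds the values lagged by k steps.
    return np.hstack([X[lags - k : n - k] for k in range(lags + 1)])


def dpca_q(X, lags, n_components):
    """Fit PCA on the standardized lagged matrix; return Q (SPE) and explained variance."""
    Z = lagged_matrix(X, lags)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions of the standardized lagged data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T
    residual = Z - Z @ P @ P.T            # part of Z not captured by the model
    q = np.sum(residual ** 2, axis=1)     # squared prediction error per time point
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return q, cumulative[:n_components]


# Simulated VAR(1) data, x_t = phi @ x_{t-1} + e_t, with a hypothetical stable phi.
rng = np.random.default_rng(0)
phi = np.array([[0.5, 0.2], [0.1, 0.4]])
X = np.zeros((500, 2))
for t in range(1, X.shape[0]):
    X[t] = phi @ X[t - 1] + rng.standard_normal(2)

q, cumulative = dpca_q(X, lags=1, n_components=2)
print(q.shape, cumulative)
```

In practice the Q series would be examined for remaining serial correlation while varying `n_components`, which is the visual check the abstract refers to; the sketch stops at computing the quantities.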