A novel variable selection approach based on co-linearity index to discover optimal process settings by analysing mixed data

Bibliographic Details
Published in: Computers & Industrial Engineering, 2014-06, Vol. 72, p. 217-229
Main Authors: Giannetti, C., Ransing, R.S., Ransing, M.R., Bould, D.C., Gethin, D.T., Sienz, J.
Format: Article
Language: English
Description
Summary:
• We propose a novel variable selection approach for a mixture of categorical and quantitative data.
• We provide a mathematical formulation of the technique.
• We discuss the data transformations necessary to juxtapose categorical and quantitative data.
• We show how the technique can be used to assist in manufacturing diagnosis and root cause analysis.

In the last two decades, the application of statistical techniques to process control has gained popularity due to the widespread adoption of quality management systems such as ISO9001. Demonstration of continual process improvement by monitoring process effectiveness has become an integral part of satisfying the requirements of clause 8 of the ISO9001:2008 standard. Process effectiveness is measured in terms of one or more process responses. Data-driven approaches are often used to associate the variability in process responses with one or more process variables. However, traditional techniques become impractical in the presence of a large number of variables and noisy data sets. This paper extends the co-linearity index and penalty matrix approach (Ransing et al., 2013) for discovering noise-free correlations between heterogeneous process variables and responses. Noise is removed by reducing the dimensionality of the variable space and by using robust data pre-treatment methods, which are more suitable in the presence of outliers and skewed distributions for process variables. Scaling factors are proposed to balance the variance contributions from response variables, quantitative variables and categorical variables. The proposed method allows process variables with skewed distributions to contribute more to the variance than Gaussian-distributed variables, so that these variables can be investigated further if necessary. Correlations are visualised in a single plot and can be used in real industrial settings to assist process engineers in manufacturing diagnosis and root cause analysis. The applicability and validity of this novel method have been demonstrated through two industrial case studies.
ISSN: 0360-8352, 1879-0550
DOI: 10.1016/j.cie.2014.03.017
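
The summary describes three ingredients: robust pre-treatment of quantitative variables, scaling of categorical variables so they do not dominate the variance, and dimensionality reduction before correlations with a response are read off. The sketch below illustrates that general idea only. It is not the authors' formulation: the median/MAD scaling, the 1/sqrt(k) factor for a categorical variable with k levels, the use of a plain SVD, the `cosine_alignment` measure (a stand-in, not the paper's co-linearity index), and all variable names and synthetic data are assumptions made for illustration.

```python
import numpy as np

def robust_scale(x):
    """Centre on the median and scale by the MAD (an assumed robust pre-treatment)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:
        mad = 1.0  # avoid division by zero for near-constant columns
    # 1.4826 makes the MAD comparable to the standard deviation for Gaussian data
    return (x - med) / (1.4826 * mad)

def one_hot(labels):
    """Centred indicator columns for one categorical variable, divided by sqrt(k).

    The 1/sqrt(k) block scaling is a hypothetical choice, not the paper's factor.
    """
    labels = np.asarray(labels)
    levels = np.unique(labels)
    block = (labels[:, None] == levels[None, :]).astype(float)
    block -= block.mean(axis=0)
    return block / np.sqrt(len(levels)), levels

def loadings(X, n_components=2):
    """Variable loadings on the leading components, from an SVD of X."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_components].T * s[:n_components]  # shape (n_vars, n_components)

def cosine_alignment(L, response_idx):
    """Cosine of the angle between each loading vector and the response's."""
    r = L[response_idx]
    norms = np.linalg.norm(L, axis=1) * np.linalg.norm(r)
    norms[norms == 0] = np.inf  # zero-length loadings get alignment 0
    return (L @ r) / norms

# Synthetic, made-up process data (illustrative only).
rng = np.random.default_rng(0)
n = 200
temperature = rng.normal(size=n)                # roughly Gaussian variable
pressure = rng.lognormal(size=n)                # skewed variable
shift = rng.choice(["A", "B", "C"], size=n)     # categorical variable
response = 2.0 * temperature + 0.1 * rng.normal(size=n)

shift_block, shift_levels = one_hot(shift)
X = np.column_stack([
    robust_scale(temperature),
    robust_scale(pressure),
    shift_block,
    robust_scale(response),
])
names = ["temperature", "pressure"] + [f"shift={v}" for v in shift_levels] + ["response"]

L = loadings(X, n_components=2)
align = cosine_alignment(L, response_idx=len(names) - 1)
for name, value in zip(names, align):
    print(f"{name:12s} {value:+.2f}")
```

With median/MAD scaling, a Gaussian column ends up with variance close to one while a skewed or heavy-tailed column tends to keep a larger variance, which matches the behaviour the summary describes for skewed variables. Values of `align` near +1 or -1 flag variables whose loading vector points along or against the response in the reduced space.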