Novel mixed integer optimization sparse regression approach in chemometrics

Bibliographic Details
Published in: Analytica Chimica Acta, 2020-11, Vol. 1137, p. 115-124
Main Authors: Bertsimas, D., Lahlou Kitane, D., Azami, N., Doucet, F.R.
Format: Article
Language: English
Description
Summary: Sparse mathematical modelling plays an increasingly important role in chemometrics due to its interpretability and predictive power. While many sparse techniques used in chemometrics rely on L1 penalization to create sparser models, Mixed Integer Optimization (MIO) achieves sparsity by imposing constraints directly in the model. In this paper, we develop an intuitive and flexible robust sparse regression framework using MIO. We use constraints to achieve sparsity and penalization to achieve robustness. We compare the results with those obtained using other techniques that generate sparse models, such as LASSO and sparse PLS (SPLS), and we use PLS as a baseline for predictive performance. We illustrate the framework with different objective functions on a LIBS data set of certified reference materials (CRM) of various mineral ores. The proposed MIO framework improves accuracy, sparsity and robustness relative to LASSO and SPLS. MIO achieves an average R² that is higher than that of the other methods by at least 10.6%. The robust MIO approach also improves interpretability: it uses 4.3 variables on average, while LASSO and SPLS use 16.1 and 805.8, respectively. We also illustrate how interpretability can help build better models through examples derived from the data sets used. When noise is added to the signal, MIO achieves an R² of 0.69 on average, whereas all other models have negative R² values. The proposed MIO framework is versatile and could be used in other chemometrics applications.
• Novel mixed integer calibration method.
• Best Subset Selection.
• Benchmark with existing methods.
• Tests on mineral ore.
ISSN: 0003-2670, 1873-4324
DOI: 10.1016/j.aca.2020.08.054
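
The summary describes sparsity imposed through constraints and robustness obtained through penalization. A minimal Python sketch of that idea, not the authors' implementation, is a cardinality-constrained ridge regression: minimize ||y − Xβ||² + γ||β||² subject to at most k nonzero coefficients. Exhaustive enumeration of supports stands in here for a MIO solver; it solves this particular formulation exactly but only scales to small problems. The function name `best_subset_ridge`, the penalty weight `gamma`, and the synthetic data are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def best_subset_ridge(X, y, k, gamma=1e-3):
    """Minimize ||y - X b||^2 + gamma * ||b||^2 subject to at most k nonzeros in b.

    Hypothetical helper: brute-force enumeration replaces the MIO solver,
    so this is only practical when the number of columns is small.
    """
    n_features = X.shape[1]
    best_obj, best_support, best_coef = np.inf, None, None
    # A support of size k is never worse than a smaller one (extra
    # coefficients can be zero), so only size-k supports are enumerated.
    for support in combinations(range(n_features), k):
        Xs = X[:, support]
        # Closed-form ridge solution restricted to the chosen columns.
        coef = np.linalg.solve(Xs.T @ Xs + gamma * np.eye(k), Xs.T @ y)
        resid = y - Xs @ coef
        obj = resid @ resid + gamma * (coef @ coef)
        if obj < best_obj:
            best_obj, best_support, best_coef = obj, support, coef
    beta = np.zeros(n_features)
    beta[list(best_support)] = best_coef
    return beta, best_support


# Synthetic example (assumed data, unrelated to the LIBS data set in the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 12))
beta_true = np.zeros(12)
beta_true[[2, 7, 9]] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.05 * rng.normal(size=60)

beta_hat, support = best_subset_ridge(X, y, k=3)
print("selected columns:", support)
print("nonzero coefficients:", np.round(beta_hat[list(support)], 2))
```

The cardinality bound `k` controls sparsity directly, while `gamma` shrinks the selected coefficients. This mirrors the division of roles between constraints and penalization that the summary describes, without claiming to reproduce the paper's objective functions.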