The Effect of Multicollinearity on Feature Selection
Published in: | Indian Journal of Science and Technology, 2024-09, Vol. 17 (35), p. 3664-3668 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | Objectives: To provide a new LASSO-based feature selection technique that helps select the important variables for predicting the response variable in the presence of multicollinearity. Methods: LASSO is a regression method used to select important covariates for predicting a dependent variable. The traditional LASSO method relies on the conventional Ordinary Least Squares (OLS) criterion for this purpose. The OLS-based LASSO approach gives unreliable results if the data deviate from normality. This study therefore recommends a redescending M-estimator-based LASSO approach. The efficacy of the new method is compared with that of the ordinary LASSO method using a real dataset and a simulation study with various sample sizes (N = 100, 200, 1000), numbers of predictors (p = 10, 15, 20), and degrees of correlation (ρ = 0.96, 0.98, 0.999). Findings: The usual OLS-based LASSO has difficulty selecting important variables when the independent variables are correlated. The redescending M-estimator-based LASSO addresses the pitfalls of the conventional LASSO methodology. The proposed method outperforms the conventional LASSO by picking out significant variables more effectively, particularly in the presence of multicollinearity. Novelty: The conventional OLS-based LASSO approach selects a greater number of non-significant variables in the presence of multicollinearity, whereas the proposed redescending M-estimator-based LASSO approach selects the important variables. Keywords: Feature Selection, LASSO, MDAE, VIF, Variable Selection |
---|---|
ISSN: | 0974-6846 0974-5645 |
DOI: | 10.17485/IJST/v17i35.1876 |
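
The simulation setting described in the summary (equicorrelated predictors at a chosen ρ, checked with VIF and passed to a LASSO) can be sketched roughly as follows. This is an illustrative sketch only, not the article's code: the function names, the true coefficients, the noise model, and the penalty value `lam` are all assumptions, and the solver is a plain least-squares LASSO by coordinate descent rather than the redescending M-estimator-based variant the article proposes.

```python
import numpy as np


def simulate_design(n, p, rho, rng):
    """Draw n rows of p standard-normal predictors with pairwise correlation rho."""
    cov = np.full((p, p), rho)
    np.fill_diagonal(cov, 1.0)
    return rng.multivariate_normal(np.zeros(p), cov, size=n)


def vif(X):
    """Variance inflation factors: diagonal of the inverse sample correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def lasso_cd(X, y, lam, n_iter=500):
    """Least-squares LASSO via cyclic coordinate descent.

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    This is the conventional (non-robust) LASSO, used here as a baseline.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]  # remove predictor j's current contribution
            z = X[:, j] @ resid / n
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]  # soft-threshold
            resid -= X[:, j] * beta[j]  # add the updated contribution back
    return beta


rng = np.random.default_rng(0)
n, p, rho = 200, 10, 0.98  # one cell of the grid listed in the summary

X = simulate_design(n, p, rho, rng)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns

# Hypothetical truth: only the first three predictors matter.
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + rng.normal(size=n)
y = y - y.mean()

vifs = vif(X)
beta_hat = lasso_cd(X, y, lam=0.1)  # lam chosen arbitrarily for illustration
selected = np.flatnonzero(beta_hat)

print("VIF per predictor:", np.round(vifs, 1))
print("Selected predictor indices:", selected)
```

Comparing `selected` with the indices of the nonzero entries in `true_beta` across repeated draws gives a rough sense of how often a conventional LASSO keeps irrelevant predictors when ρ is close to 1.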