An exponential-type kernel robust regression model for interval-valued variables
Published in: | Information sciences 2018-07, Vol.454-455, p.419-442 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • The paper provides a robust regression method for interval-valued variables.
• The sum of squared errors is computed using exponential-type kernel functions.
• Outlier observations receive a small weight in the parameter estimates.
• Applications to synthetic and real data sets corroborate the proposed method.
Outliers are very common in regression problems, and robust regression methods are strongly recommended so that poorly fitted observations do not affect the parameter estimates of the model. Interval-valued variables are becoming common in data analysis because this type of data represents either the uncertainty due to measurement error or the natural variability present in the data. Few robust regression methods have been proposed in the literature to handle outliers in interval-valued data sets. This paper introduces a new robust regression method for interval-valued variables that penalizes outliers in the midpoints and/or the ranges of interval-valued observations through exponential-type kernel functions. The weight given to the midpoint and range of each interval-valued observation is updated at each iteration in order to optimize a suitable objective function. Convergence of the parameter estimation algorithm is guaranteed at a low computational cost. The proposed method is also compared with some previous robust regression approaches for interval-valued variables. The performance of these methods is evaluated in a Monte Carlo framework based on the bias and mean squared error (MSE) of the parameter estimates for the midpoints and ranges of the intervals, using synthetic data sets with X-space outliers, Y-space outliers and leverage points, and different sample sizes and percentages of outliers. The results suggest that the proposed approach performs competitively with, or better than, the previous approaches in interval-valued outlier scenarios comparable to those found in practice. Applications to real interval-valued data sets corroborate the usefulness of the proposed method. |
---|---|
ISSN: | 0020-0255 1872-6291 |
DOI: | 10.1016/j.ins.2018.05.008 |
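
The iterative reweighting described in the summary can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes a single explanatory variable, a Gaussian kernel as the exponential-type kernel, separate regressions for the midpoints and the ranges, and a bandwidth fixed at the median squared residual of an initial ordinary least-squares fit. The function names (`wls_line`, `kernel_robust_line`, `fit_interval_model`) are hypothetical and chosen only for this illustration.

```python
import math
from statistics import median


def wls_line(x, y, w):
    """Closed-form weighted least-squares line y ~ b0 + b1 * x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def kernel_robust_line(x, y, n_iter=100, tol=1e-10):
    """Iteratively reweighted fit with Gaussian-kernel weights (sketch)."""
    w = [1.0] * len(x)
    b0, b1 = wls_line(x, y, w)  # start from ordinary least squares
    # Assumption: bandwidth fixed at the median squared initial residual.
    sq_resid = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    sigma2 = max(median(sq_resid), 1e-12)
    for _ in range(n_iter):
        # Poorly fitted observations get weights close to zero.
        w = [math.exp(-((yi - b0 - b1 * xi) ** 2) / (2.0 * sigma2))
             for xi, yi in zip(x, y)]
        new_b0, new_b1 = wls_line(x, y, w)
        shift = max(abs(new_b0 - b0), abs(new_b1 - b1))
        b0, b1 = new_b0, new_b1
        if shift < tol:
            break
    # w holds the weights from the last reweighting step.
    return (b0, b1), w


def fit_interval_model(x_lo, x_hi, y_lo, y_hi):
    """Fit separate robust lines to interval midpoints and ranges."""
    x_mid = [(a + b) / 2.0 for a, b in zip(x_lo, x_hi)]
    y_mid = [(a + b) / 2.0 for a, b in zip(y_lo, y_hi)]
    x_rng = [b - a for a, b in zip(x_lo, x_hi)]
    y_rng = [b - a for a, b in zip(y_lo, y_hi)]
    return {
        "mid": kernel_robust_line(x_mid, y_mid),
        "range": kernel_robust_line(x_rng, y_rng),
    }
```

Calling `fit_interval_model` with the lower and upper bounds of the explanatory and response intervals returns, for the midpoints and for the ranges, the fitted intercept and slope together with the final per-observation weights. Observations whose weight is close to zero are the ones this sketch treats as outliers.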