Time-Efficient Algorithms for Robust Estimators of Location, Scale, Symmetry, and Tail Heaviness

Bibliographic Details
Published in: The Stata Journal 2015-04, Vol. 15 (1), p. 77-94
Main Authors: Gelade, Wouter, Verardi, Vincenzo, Vermandele, Catherine
Format: Article
Language: English
Description
Summary: The analysis of the empirical distribution of univariate data often includes the computation of location, scale, skewness, and tail-heaviness measures, which are estimates of specific parameters of the underlying population distribution. Several measures are available, but they differ by Gaussian efficiency, robustness regarding outliers, and meaning in the case of asymmetric distributions. In this article, we briefly compare, for each type of parameter (location, scale, skewness, and tail heaviness), the “classical” estimator based on (centered) moments of the empirical distribution, an estimator based on specific quantiles of the distribution, and an estimator based on pairwise comparisons of the observations. This last one always performs better than the other estimators, particularly in terms of robustness, but it requires a heavy computation time of an order of n². Fortunately, as explained in Croux and Rousseeuw (1992, Computational Statistics 1: 411–428), the algorithm of Johnson and Mizoguchi (1978, SIAM Journal of Scientific Computing 7: 147–153) allows one to substantially reduce the computation time to an order of n log n and, hence, allows the use of robust estimators based on pairwise comparisons, even in very large datasets. This has motivated us to program this algorithm for Stata. In this article, we describe the algorithm and the associated commands. We also illustrate the computation of these robust estimators by involving them in a normality test of Jarque–Bera form (Jarque and Bera 1980, Economics Letters 6: 255–259; Brys, Hubert, and Struyf, 2008, Computational Statistics 23: 429–442) using real data.
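To make the order-n² cost mentioned in the summary concrete, here is a minimal Python sketch of a naive pairwise location estimator. It is an illustration only: the summary does not name the estimators, so the Hodges-Lehmann estimator (median of pairwise averages over i < j) is assumed here as one well-known example, and the function names `pair_count` and `hodges_lehmann_naive` are hypothetical. This is not the article's Stata implementation and not the n log n algorithm it describes; it simply enumerates every pair.

```python
from itertools import combinations
from statistics import median


def pair_count(n):
    """Number of distinct pairs (i < j) among n observations: n(n-1)/2."""
    return n * (n - 1) // 2


def hodges_lehmann_naive(xs):
    """Median of the pairwise averages (x_i + x_j) / 2 over all i < j.

    Assumed example of a pairwise-comparison location estimator.
    Brute force: builds all n(n-1)/2 averages, so time and memory
    grow on the order of n^2.
    """
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    averages = [(a + b) / 2 for a, b in combinations(xs, 2)]
    return median(averages)


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]            # one gross outlier
    print(pair_count(len(data)))        # number of pairs enumerated
    print(sum(data) / len(data))        # classical mean, pulled toward the outlier
    print(hodges_lehmann_naive(data))   # pairwise estimator, stays near the bulk
```

With a single gross outlier the mean moves far from the bulk of the data, while the pairwise estimator does not; the price is that the number of pairs grows quadratically, which is the cost the faster algorithm is meant to avoid.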
ISSN: 1536-867X, 1536-8734
DOI: 10.1177/1536867X1501500105