Relative error-based distributed estimation in growing dimensions

Bibliographic Details
Published in:	Applied mathematical modelling 2024-11, Vol.135, p.601-619
Main Authors:	Li, Xiaoyan; Xia, Xiaochao; Zhang, Zhimin
Format:	Article
Language:	English
Subjects:	Growing dimensions; Large-scale positive response data; Least product relative error; Multiplicative models

Description
Summary:	This paper studies the estimation of multiplicative models for large-scale positive response data with growing dimensions. First, we propose a communication-efficient least product relative error estimator, defined as the minimizer of a surrogate loss function that approximates the global least product relative error loss. A practically efficient distributed Newton-Raphson algorithm is then proposed to compute it. Theoretically, we show that, under regularity conditions, the distributed estimator achieves the same statistical efficiency as the global estimator, both when the dimension is fixed and when it increases with the local sample size. Second, by incorporating an adaptive lasso penalty into the surrogate loss, we develop a communication-efficient penalized least product relative error estimator for high-dimensional variable selection on massive positive response data, together with a distributed algorithm based on the alternating direction method of multipliers. The distributed penalized estimator is shown to have the oracle property in the growing-dimensional setting. Finally, extensive simulations and two real-world applications demonstrate the superiority of our proposal.
•A model is proposed to handle massive positive response data with growing dimensions.
•Two new distributed estimators based on a relative error loss are proposed.
•Two communication-efficient distributed algorithms are designed.
•The convergence rate and asymptotic normality of the new estimators are derived.
ISSN:	0307-904X
DOI:	10.1016/j.apm.2024.07.013
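
The unpenalized estimator described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not state the surrogate loss, so the code assumes a common gradient-corrected construction in which every machine sends its local gradient and only the first machine supplies curvature. The least product relative error loss used here is the standard form, the mean of y*exp(-x'b) + exp(x'b)/y - 2, for the multiplicative model y = exp(x'b)*eps. The names `lpre_grad_hess`, `distributed_lpre`, and `blocks`, along with all simulation settings, are hypothetical.

```python
import numpy as np


def lpre_grad_hess(X, y, beta):
    """Average gradient and Hessian of the LPRE loss
    mean(y*exp(-x'b) + exp(x'b)/y - 2) on one block of data."""
    eta = X @ beta
    a = y * np.exp(-eta)   # y_i * exp(-x_i' beta)
    b = np.exp(eta) / y    # exp(x_i' beta) / y_i
    grad = X.T @ (b - a) / len(y)
    hess = (X * (a + b)[:, None]).T @ X / len(y)
    return grad, hess


def distributed_lpre(blocks, n_iter=20):
    """Approximate Newton iterations (assumed scheme, not taken from the paper):
    the gradient is averaged over all blocks, the Hessian comes from block 0."""
    p = blocks[0][0].shape[1]
    beta = np.zeros(p)
    n_total = sum(len(y) for _, y in blocks)
    for _ in range(n_iter):
        # One communication round: each machine contributes a p-vector.
        g = sum(len(y) * lpre_grad_hess(X, y, beta)[0] for X, y in blocks) / n_total
        # Curvature is computed locally on the first machine only.
        _, h1 = lpre_grad_hess(*blocks[0], beta)
        beta = beta - np.linalg.solve(h1, g)
    return beta


# Simulated positive responses; log-errors are symmetric about zero.
rng = np.random.default_rng(0)
n, p, K = 4000, 3, 5
beta_true = np.array([0.5, -0.3, 0.2])
X = rng.normal(size=(n, p))
y = np.exp(X @ beta_true + rng.uniform(-0.5, 0.5, size=n))
blocks = list(zip(np.array_split(X, K), np.array_split(y, K)))
beta_hat = distributed_lpre(blocks)
print(beta_hat)
```

Each iteration communicates only p-dimensional gradients rather than raw data or p-by-p Hessians, which is the sense in which such a scheme is communication-efficient. The adaptive lasso variant and its ADMM solver are not sketched here.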