Sparse estimation in semiparametric finite mixture of varying coefficient regression models
Published in: | Biometrics 2023-12, Vol.79 (4), p.3445-3457 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | Finite mixtures of regressions (FMRs) are commonly used to model heterogeneous effects of covariates on a response variable when there are unknown underlying subpopulations. FMRs, however, cannot accommodate situations where the covariates' effects also vary with an “index” variable; models for this setting are known as finite mixtures of varying coefficient regressions (FM‐VCR). Although complex, this situation occurs in real data applications: the osteocalcin (OCN) data analyzed in this manuscript present a heterogeneous relationship in which the effect of a genetic variant on OCN in each hidden subpopulation varies over time. The number of covariates with varying coefficients often presents a further challenge: in the OCN study, genetic variants on the same chromosome are considered jointly. The relative proportions of the hidden subpopulations may also change over time. Existing methods do not accommodate all of these features at once. To fill this gap, we develop statistical methodologies based on regularized local‐kernel likelihood for simultaneous parameter estimation and variable selection in sparse FM‐VCR models. We study the large‐sample properties of the proposed methods. We then carry out a simulation study to evaluate the performance of various penalties adopted for our regularized approach and to assess the ability of a BIC‐type criterion to estimate the number of subpopulations. Finally, we apply the FM‐VCR model to the OCN data and identify several covariates, including genetic variants, that have age‐dependent effects on OCN. |
---|---|
ISSN: | 0006-341X, 1541-0420 |
DOI: | 10.1111/biom.13870 |