A multivariate extension of a vector of two-parameter Poisson-Dirichlet processes
Published in: | Journal of nonparametric statistics 2015-01, Vol.27 (1), p.89-105 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In the big data era there is a growing need to model the main features of large and non-trivial data sets. This paper proposes a Bayesian nonparametric prior for modelling situations where data are divided into different units with different densities, allowing information pooling across the groups. Leisen and Lijoi [(2011), 'Vectors of Poisson-Dirichlet processes', J. Multivariate Anal., 102, 482-495] introduced a bivariate vector of random probability measures with Poisson-Dirichlet marginals, where the dependence is induced through a Lévy copula. In this paper the same approach is used to generalise such a vector to the multivariate setting. A first important contribution is the derivation of the Laplace functional transform, which is non-trivial in the multivariate setting. The Laplace transform is the basis for deriving the exchangeable partition probability function (EPPF) and, as a second contribution, we provide an expression of the EPPF for the multivariate setting. Finally, a novel Markov chain Monte Carlo algorithm for evaluating the EPPF is introduced and tested. In particular, numerical illustrations of the clustering behaviour of the new prior are provided. |
---|---|
ISSN: | 1048-5252 1029-0311 |
DOI: | 10.1080/10485252.2014.966103 |