
A multivariate extension of a vector of two-parameter Poisson-Dirichlet processes


Bibliographic Details
Published in: Journal of Nonparametric Statistics, 2015-01, Vol. 27 (1), p. 89-105
Main Authors: Zhu, Weixuan; Leisen, Fabrizio
Format: Article
Language: English
Description
Summary: In the big data era there is a growing need to model the main features of large and non-trivial data sets. This paper proposes a Bayesian nonparametric prior for modelling situations where data are divided into different units with different densities, allowing information to be pooled across the groups. Leisen and Lijoi [(2011), 'Vectors of Poisson-Dirichlet processes', J. Multivariate Anal., 102, 482-495] introduced a bivariate vector of random probability measures with Poisson-Dirichlet marginals, where the dependence is induced through a Lévy copula. In this paper the same approach is used to generalise such a vector to the multivariate setting. A first important contribution is the derivation of the Laplace functional transform, which is non-trivial in the multivariate setting. The Laplace transform is the basis for deriving the exchangeable partition probability function (EPPF) and, as a second contribution, we provide an expression of the EPPF for the multivariate setting. Finally, a novel Markov chain Monte Carlo algorithm for evaluating the EPPF is introduced and tested. In particular, numerical illustrations of the clustering behaviour of the new prior are provided.
ISSN: 1048-5252, 1029-0311
DOI: 10.1080/10485252.2014.966103
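
Illustration: the summary refers to the exchangeable partition probability function (EPPF) of a prior with Poisson-Dirichlet marginals. As background only, the sketch below evaluates the standard closed-form EPPF of a single (univariate) two-parameter Poisson-Dirichlet process, i.e. one marginal. It is not the multivariate expression derived in the paper, which this record does not reproduce. The function names `log_rising` and `py_eppf` and the parameter names `sigma` and `theta` are assumptions made for this example.

```python
from math import exp, lgamma, log


def log_rising(x, m):
    """Log of the rising factorial (x)_m = x (x+1) ... (x+m-1), for x > 0."""
    return lgamma(x + m) - lgamma(x)


def py_eppf(counts, sigma, theta):
    """Univariate two-parameter Poisson-Dirichlet EPPF (illustrative sketch).

    counts: cluster sizes n_1, ..., n_k of a partition of n items.
    Assumes 0 <= sigma < 1 and theta > -sigma (theta > 0 when sigma == 0).
    Returns the probability of one particular partition with these sizes.
    """
    n = sum(counts)
    k = len(counts)
    # prod_{i=1}^{k-1} (theta + i * sigma)
    log_p = sum(log(theta + i * sigma) for i in range(1, k))
    # divided by (theta + 1)_{n-1}
    log_p -= log_rising(theta + 1.0, n - 1)
    # times prod_j (1 - sigma)_{n_j - 1}
    log_p += sum(log_rising(1.0 - sigma, nj - 1) for nj in counts)
    return exp(log_p)


if __name__ == "__main__":
    # Three items: one partition of shape (3), three of shape (2, 1),
    # and one of shape (1, 1, 1); the probabilities should sum to 1.
    total = (py_eppf([3], 0.5, 1.0)
             + 3 * py_eppf([2, 1], 0.5, 1.0)
             + py_eppf([1, 1, 1], 0.5, 1.0))
    print(total)
```

Working on the log scale with `lgamma` keeps the rising factorials stable for large cluster sizes. Summing the EPPF over all set partitions of a small sample, as in the usage lines, is a quick sanity check that the formula is normalised.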