High-performance sampling of generic determinantal point processes
Published in: | Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences, 2020-03, Vol.378 (2166), p.1-17 |
---|---|
Main Author: | |
Format: | Article |
Language: | English |
Summary: | Determinantal point processes (DPPs) were introduced by Macchi (Macchi 1975 Adv. Appl. Probab. 7, 83–122) as a model for repulsive (fermionic) particle distributions. But their recent popularization is largely due to their usefulness for encouraging diversity in the final stage of a recommender system (Kulesza & Taskar 2012 Found. Trends Mach. Learn. 5, 123–286). The standard sampling scheme for finite DPPs is a spectral decomposition followed by an equivalent of a randomly diagonally pivoted Cholesky factorization of an orthogonal projection, which is only applicable to Hermitian kernels and has an expensive set-up cost. Researchers (Launay et al. 2018 (http://arxiv.org/abs/1802.08429); Chen & Zhang 2018 NeurIPS (https://papers.nips.cc/paper/7805-fast-greedy-mapin-ference-for-determinantal-point-process-to-improverecommendation-diversity.pdf)) have begun to connect DPP sampling to LDL^H factorizations as a means of avoiding the initial spectral decomposition, but existing approaches have only outperformed the spectral decomposition approach in special circumstances, where the number of kept modes is a small percentage of the ground set size. This article proves that trivial modifications of LU and LDL^H factorizations yield efficient direct sampling schemes for non-Hermitian and Hermitian DPP kernels, respectively. Furthermore, it is experimentally shown that even dynamically scheduled, shared-memory parallelizations of high-performance dense and sparse-direct factorizations can be trivially modified to yield DPP sampling schemes with essentially identical performance. The software developed as part of this research, Catamari (hodgestar.com/catamari), is released under the Mozilla Public License v.2.0. It contains header-only, C++14 plus OpenMP 4.0 implementations of dense and sparse-direct, Hermitian and non-Hermitian DPP samplers.
This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’. |
---|---|
ISSN: | 1364-503X 1471-2962 |
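
The summary's central claim, that a small change to an unpivoted LU factorization yields a DPP sampler, can be illustrated with a minimal dense sketch. It is not taken from Catamari and does not reflect its API: the function name `sample_dpp`, the list-of-lists kernel layout, and the restriction to real-valued marginal kernels are all assumptions made for illustration. At step j the current diagonal entry is treated as the probability of keeping index j given the earlier decisions; if the index is rejected, 1 is subtracted from that pivot before the usual elimination step.

```python
import random


def sample_dpp(kernel, rng=None):
    """Draw one sample from a finite DPP given its real marginal kernel.

    `kernel` is a square list of lists (hypothetical layout, not Catamari's).
    The loop is an unpivoted, right-looking LU factorization with one change:
    a Bernoulli draw on each pivot decides whether the index is kept.
    """
    rng = rng or random.Random()
    n = len(kernel)
    a = [list(row) for row in kernel]  # work on a copy of the kernel
    sample = []
    for j in range(n):
        # a[j][j] is the conditional inclusion probability of index j.
        if rng.random() < a[j][j]:
            sample.append(j)
        else:
            a[j][j] -= 1.0  # condition on index j being excluded
        pivot = a[j][j]
        # Standard LU step: scale the column, then update the Schur complement.
        for i in range(j + 1, n):
            a[i][j] /= pivot
        for i in range(j + 1, n):
            lij = a[i][j]
            for k in range(j + 1, n):
                a[i][k] -= lij * a[j][k]
    return sample


if __name__ == "__main__":
    # Rank-one projection kernel: every sample contains exactly one index.
    print(sample_dpp([[0.5, 0.5], [0.5, 0.5]], random.Random(0)))
```

For a Hermitian kernel the same loop could exploit symmetry and update only the lower triangle, which corresponds to the LDL^H variant named in the summary; this sketch keeps the general LU form for simplicity and makes no attempt at the blocked or parallel scheduling the article evaluates.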