Loading…

Dual-tree fast exact max-kernel search

The problem of max‐kernel search arises everywhere: given a query point \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$p_q$ \end{document}, a set of reference objects \documentclass{article}\usepackage{amsmath}...

Full description

Saved in:
Bibliographic Details
Published in:Statistical analysis and data mining 2014-08, Vol.7 (4), p.229-253
Main Authors: Curtin, Ryan R., Ram, Parikshit
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The problem of max‐kernel search arises everywhere: given a query point \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$p_q$ \end{document}, a set of reference objects \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$S_r$ \end{document} and some kernel \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$\mathcal{K}$ \end{document}, find \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$arg\,max_{p_r \in S_r} \mathcal{K}(p_q, p_r)$ \end{document}. Max‐kernel search is ubiquitous and appears in countless domains of science, thanks to the wide applicability of kernels. A few domains include image matching, information retrieval, bio‐informatics, similarity search, and collaborative filtering (to name just a few). However, there is no generalized technique for efficiently solving max‐kernel search. This paper presents a single‐tree algorithm called single‐tree FastMKS which returns the max‐kernel solution for a single query point in provably \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$O(\log N)$ \end{document} time (where \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$N$ \end{document} is the number of reference objects), and also a dual‐tree algorithm (dual‐tree FastMKS) which is useful for max‐kernel search with many query points. If the set of query points is of size \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$O(N)$ \end{document}, this algorithm returns a solution in provably \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$O(N)$ \end{document} time, which is significantly better than the \documentclass{article}\usepackage{amsmath}\usepackage{amssymb}\usepackage{amsbsy}\usepackage{amsfonts}\pagestyle{empty}\begin{document}$O(N^2)$ \end{document} linear scan solution; these bounds are dependent on the expansion constant of the data. These algorithms work for
ISSN:1932-1864
1932-1872
DOI:10.1002/sam.11218