DisRFC: a dissimilarity-based Random Forest Clustering approach

Bibliographic Details
Published in: Pattern Recognition, 2023-01, Vol. 133, p. 109036, Article 109036
Main Author: Bicego, Manuele
Format: Article
Language: English
Description
Summary:
• We present the first dissimilarity-based Random Forest clustering approach.
• The approach works only with distances, and is thus appropriate for non-vectorial objects.
• The approach also works with non-metric dissimilarities.
• We present a novel unsupervised RF variant that works only with dissimilarities.

In this paper we present a novel Random Forest Clustering approach, called Dissimilarity Random Forest Clustering (DisRFC), which requires only pairwise dissimilarities as input. Thanks to this characteristic, the proposed approach is applicable to all problems that involve non-vectorial representations, such as strings, sequences, graphs or 3D structures. In the proposed approach, we first train an Unsupervised Dissimilarity Random Forest (UD-RF), a novel variant of Random Forest which is completely unsupervised and based on dissimilarities. Then, we exploit the trained UD-RF to project the patterns to be clustered into a binary vector space, where the clustering is finally derived using fast and effective K-means procedures. In the paper we introduce different variants of DisRFC, which are thoroughly and positively evaluated on 12 different problems, also in comparison with alternative state-of-the-art approaches.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2022.109036
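
The three-stage pipeline outlined in the summary (unsupervised forest built from dissimilarities, binary encoding, K-means) can be illustrated with a rough Python sketch. This is not the paper's UD-RF: the record does not give the split rule or the encoding, so everything below is an assumption made for illustration. Each node picks two random prototypes and sends every object to the closer one, each object is encoded by one-hot leaf membership per tree, and a plain Lloyd-style K-means clusters the result. The names `build_tree`, `forest_embedding` and `kmeans`, and all parameter defaults, are hypothetical.

```python
import numpy as np


def build_tree(D, idx, rng, depth=0, max_depth=2, min_size=2):
    """Split the objects in idx using only the dissimilarity matrix D.

    Assumed split rule (not taken from the paper): draw two random
    prototypes and send each object to the closer one.
    Returns a list of leaves, each an array of object indices.
    """
    if depth >= max_depth or len(idx) <= max(min_size, 1):
        return [idx]
    p, q = rng.choice(idx, size=2, replace=False)
    go_left = D[idx, p] <= D[idx, q]
    left, right = idx[go_left], idx[~go_left]
    if len(left) == 0 or len(right) == 0:
        return [idx]  # degenerate split: stop here
    return (build_tree(D, left, rng, depth + 1, max_depth, min_size)
            + build_tree(D, right, rng, depth + 1, max_depth, min_size))


def forest_embedding(D, n_trees=100, max_depth=2, seed=0):
    """Binary encoding: one-hot leaf membership, concatenated over trees."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    blocks = []
    for _ in range(n_trees):
        leaves = build_tree(D, np.arange(n), rng, max_depth=max_depth)
        block = np.zeros((n, len(leaves)), dtype=np.uint8)
        for j, leaf in enumerate(leaves):
            block[leaf, j] = 1
        blocks.append(block)
    return np.hstack(blocks)


def kmeans(X, k, n_init=10, n_iter=50, seed=0):
    """Plain Lloyd's K-means with random restarts; keeps the lowest inertia."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    best_labels, best_inertia = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
        for _ in range(n_iter):
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for c in range(k):
                members = X[labels == c]
                if len(members):
                    centers[c] = members.mean(axis=0)
        # Final assignment and inertia for the converged centers.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        inertia = d2.min(axis=1).sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels


if __name__ == "__main__":
    # Toy data: two well-separated groups on a line, seen only through D.
    x = np.concatenate([np.linspace(0, 1, 10), np.linspace(100, 101, 10)])
    D = np.abs(x[:, None] - x[None, :])
    E = forest_embedding(D)
    print(kmeans(E, k=2))
```

Only the matrix `D` is ever consulted when building the trees, so the same sketch would accept dissimilarities computed between strings or graphs, including non-metric ones. Shallow trees are used here because, with one-hot leaf encoding, deeper trees leave few objects sharing a leaf and the embedding becomes less informative.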