Identification of non-disjoint clusters with small and parameterizable overlaps

Bibliographic Details
Main Authors: Ben N'Cir, Chiheb-Eddine; Cleuziou, G.; Essoussi, N.
Format: Conference Proceeding
Language: English
Description
Summary: Identifying non-disjoint groups in unlabeled data sets is an important issue in clustering. Many real-life applications require finding overlapping clusters in order to fit the structure of the data, such as clustering films, where each film can belong to several genres. This paper presents an overlapping k-means method, referred to as Restricted-OKM (Restricted Overlapping k-means), that generalizes the well-known k-means algorithm to detect overlapping clusters. The proposed method produces restricted overlapping boundaries between clusters and improves clustering accuracy, making it suited to clustering data with small overlaps. The method is then extended to control the sizes of overlaps between clusters according to user expectations. Experiments performed on overlapping data sets show that the proposed methods outperform OKM (Overlapping k-means) and fuzzy c-means in terms of clustering accuracy and produce clusters with small overlapping boundaries.
DOI: 10.1109/ICCAT.2013.6522010