Semi-supervised data clustering using particle swarm optimisation
Published in: | Soft computing (Berlin, Germany), 2020-03, Vol.24 (5), p.3499-3510 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In this study, we propose the semi-supervised particle swarm optimisation (ssPSO) algorithm for data clustering. The algorithm combines the strengths of semi-supervised fuzzy c-means (ssFCM) and particle swarm optimisation (PSO) to allow a more informed search using labelled data across a small number of iterations while maintaining diversity in the search process. ssFCM algorithms can find meaningful clusters by using available labelled data to guide the learning process. PSOs are often chosen to solve clustering problems because of their versatility in problem representation and their exploration capabilities. To verify the effectiveness of ssPSOs and provide practical insights to researchers, the clustering performance and clustering behaviour of ssPSOs are investigated and compared with PSO variants and ssFCMs. Two ssPSO approaches were studied: one applied at initialisation only and the other throughout the learning process. Evaluated on accuracy and quantisation error (QE), the ssPSO, PSO and ssFCM algorithms were tested on 13 UCI datasets differing in size, dimensionality, number of classes and distribution, exploring several swarm-size and maximum-iteration settings over 100 runs. Biplots and convergence graphs were also examined visually. ssPSOs were found to perform competitively with ssFCM on most datasets in terms of accuracy and to outperform ssFCM in terms of QE with a swarm size of 20 and a maximum of 20 iterations. The results demonstrate that ssPSOs perform particularly well on sparsely distributed datasets with overlapping clusters and produce clusters with better structure in terms of QE. Furthermore, ssPSOs were shown to perform competitively with ssFCM on datasets with more than three clusters, while QPSO performed poorly on such datasets. |
---|---|
ISSN: | 1432-7643 1433-7479 |
DOI: | 10.1007/s00500-019-04114-z |
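
The summary evaluates clusterings by quantisation error (QE) but this record does not reproduce the formula. As a rough illustration only, the sketch below uses a definition that is common in the PSO clustering literature: the average, over non-empty clusters, of the mean Euclidean distance from each point to its nearest centroid. That definition, the function name `quantisation_error` and the toy data are assumptions made for this example and may differ from what the article itself uses.

```python
import numpy as np


def quantisation_error(data, centroids):
    """Assumed QE: mean over non-empty clusters of the average
    Euclidean distance between a point and its nearest centroid."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(centroids, dtype=float)

    # Pairwise distances, shape (n_points, n_clusters).
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)

    # Assign every point to its nearest centroid.
    labels = dists.argmin(axis=1)
    nearest = dists[np.arange(len(data)), labels]

    # Average within each non-empty cluster, then across clusters.
    per_cluster = [
        nearest[labels == k].mean()
        for k in range(len(centroids))
        if np.any(labels == k)
    ]
    return float(np.mean(per_cluster))


# Hypothetical toy data: two well-separated clusters.
toy_data = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]]
toy_centroids = [[0.0, 1.0], [10.0, 2.0]]
qe = quantisation_error(toy_data, toy_centroids)
```

Under this assumed definition a lower value indicates more compact clusters, which matches the sense in which the summary treats QE as a measure of cluster structure.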