ORKM: Online regularized K-means clustering for online multi-view data
Published in: | Information sciences 2024-10, Vol.680, p.121133, Article 121133 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Data generated from different sources are sometimes called multi-view data, or online multi-view data when a time dimension is involved in generating the data. This paper concerns clustering online multi-view data, where overfitting and computational cost are the main challenges. We propose an Online Regularized K-Means Clustering (ORKMC) method to address them. Specifically, we use a matrix factorization strategy to identify the cluster indicator matrix and the cluster mean matrix for all generated data points; the strategy also includes a clustering-complexity regularization term to curb possible overfitting or overclustering. To reduce computational cost, we propose an online update step in which clustering is performed only on the latest view data at each update. Through a simulation study and an analysis of two real-world data sets, we show that the proposed ORKMC method outperforms widely used clustering methods in both clustering accuracy and computational efficiency. Finally, we develop an R package, ORKM, that implements ORKMC. |
---|---|
ISSN: | 0020-0255 |
DOI: | 10.1016/j.ins.2024.121133 |
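
The summary above describes K-means as a matrix factorization (data ≈ cluster indicator matrix × cluster mean matrix) combined with an online update that touches only the newest data. The Python sketch below illustrates that general idea with plain NumPy. It is not the ORKMC algorithm and not the API of the ORKM R package: the function names (`assign`, `indicator`, `online_update`, `objective`) are hypothetical, it handles a single view only, and the regularization term is omitted because the record does not state its form.

```python
import numpy as np


def assign(X, C):
    """Return, for each row of X, the index of the nearest row of C."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def indicator(labels, k):
    """Build the one-hot cluster indicator matrix G of shape (n, k)."""
    G = np.zeros((labels.size, k))
    G[np.arange(labels.size), labels] = 1.0
    return G


def objective(X, G, C):
    """Factorization loss ||X - G C||_F^2 (no regularization term here)."""
    return float(np.linalg.norm(X - G @ C) ** 2)


def online_update(C, counts, X_new):
    """Update the cluster means using only the newly arrived batch X_new.

    C      : (k, d) current cluster mean matrix
    counts : (k,) number of points absorbed by each cluster so far
    """
    labels = assign(X_new, C)
    G = indicator(labels, C.shape[0])
    batch_counts = G.sum(axis=0)   # points per cluster in this batch
    batch_sums = G.T @ X_new       # per-cluster coordinate sums in this batch
    new_counts = counts + batch_counts
    C = C.copy()
    seen = new_counts > 0          # leave never-used clusters unchanged
    C[seen] = (counts[seen, None] * C[seen] + batch_sums[seen]) / new_counts[seen, None]
    return C, new_counts, labels


# Illustrative run on synthetic data: two well-separated groups arriving in batches.
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 10.0]])
C = np.array([[1.0, 1.0], [9.0, 9.0]])  # rough initial guess (an assumption)
counts = np.zeros(2)
for _ in range(5):
    batch = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in true_centers])
    C, counts, labels = online_update(C, counts, batch)
```

Because each update needs only the running means and counts, earlier batches never have to be revisited, which is the source of the computational saving the summary refers to.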