ORKM: Online regularized K-means clustering for online multi-view data

Bibliographic Details
Published in: Information Sciences, 2024-10, Vol. 680, p. 121133, Article 121133
Main Authors: Guo, Guangbao; Yu, Miao; Qian, Guoqi
Format: Article
Language: English
Description
Summary: Data generated from different sources are sometimes referred to as multi-view data, and as online multi-view data when a time dimension is involved in generating the data. This paper concerns clustering online multi-view data, where overfitting and computational intensity are existing challenges. Here we propose an Online Regularized K-Means Clustering (ORKMC) method to tackle these challenges. Specifically, we use a matrix factorization strategy to identify the cluster indicator matrix and cluster mean matrix for all generated data points; this strategy also includes a clustering complexity regularization term to curb possible overfitting or overclustering. To reduce computational intensity, we propose an online update step in which clustering is performed on only the latest view data at each update. Through a simulation study and an analysis of two real-world data examples, we show that the proposed ORKMC method performs better than current widely used clustering methods in terms of clustering accuracy and computational efficiency. Finally, we develop an R package, ORKM, to implement ORKMC.
ISSN: 0020-0255
DOI: 10.1016/j.ins.2024.121133
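
Illustrative sketch (not part of the catalog record): the summary describes K-means in matrix factorization form, with a cluster indicator matrix and a cluster mean matrix, plus an online step that uses only the newest data. The Python below is a minimal, generic mini-batch style online K-means under simplifying assumptions. It is not the ORKMC method and does not use the ORKM package. It treats each incoming batch as points in one shared feature space rather than as separate views, and it omits the regularization term, whose form the summary does not state. All function names, initial centers, and data values are hypothetical.

```python
# Illustrative sketch only: a generic online K-means update, NOT the ORKMC method.
# Assumptions: one shared feature space, no regularization term, made-up data.


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Return the index of the nearest center for each point.

    The label list is a compact form of a cluster indicator matrix H:
    row i of H is one-hot at position labels[i].
    """
    return [
        min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        for p in points
    ]


def online_update(centers, counts, batch):
    """Update centers in place using only the newest batch.

    Points are assigned against the centers as they stand when the batch
    arrives; each point then moves its center by a running-mean step, so
    earlier batches never need to be revisited.
    """
    labels = assign(batch, centers)
    for p, k in zip(batch, labels):
        counts[k] += 1
        step = 1.0 / counts[k]
        centers[k] = [c + step * (x - c) for c, x in zip(centers[k], p)]
    return labels


def objective(points, centers, labels):
    """Factorization loss ||X - H C||_F^2 for a given assignment."""
    return sum(sq_dist(p, centers[k]) for p, k in zip(points, labels))


if __name__ == "__main__":
    centers = [[0.0, 0.0], [5.0, 5.0]]  # hypothetical initial cluster means (C)
    counts = [1, 1]                     # each initial center counts as one point
    batches = [                         # hypothetical data arriving over time
        [[0.2, -0.1], [5.1, 4.9]],
        [[-0.2, 0.1], [4.9, 5.1]],
    ]
    for batch in batches:
        labels = online_update(centers, counts, batch)
        print(labels, centers, objective(batch, centers, labels))
```

Because each update touches only the current batch and the stored counts, the cost per step depends on the batch size rather than on everything seen so far, which is the general motivation for an online update.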