Semantic Weighted Multi-View Clustering for Web Content
Published in: | IEEE access 2019, Vol.7, p.128097-128113 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Clustering is a long-standing and important research problem. However, it remains challenging when handling large-scale web data drawn from different types of information resources, such as user profiles, comments, and user preferences. All these aspects can be seen as different views, and they often admit the same underlying clustering of the data. In this paper, we present a novel Semantic Weighted Non-negative Matrix Factorization (SWNMF) multi-view clustering framework, which provides an efficient weighted matrix factorization framework, handles multi-view web content, and explores the sparseness problem in the semantic space of the data. Specifically, each view of the dataset forms a huge sparse matrix, which makes the matrix decomposition process non-robust and in turn reduces the accuracy of the clustering results. To address this problem, we attempt to use preference information given by the users (e.g., rating values) as latent semantic information to handle the features that are unobserved in each data point, so as to resolve the sparseness problem in the matrices of all views. To combine multiple views in our large corpus, the overall objective of our proposed SWNMF is to minimize the loss function of weighted non-negative matrix factorization (NMF) under the l_{2,1}-norm together with the co-regularized constraint under the Frobenius norm. Extensive experiments on our large-scale multi-view web datasets demonstrate the competitive performance of our solution. |
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2019.2939334 |
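
The summary describes the objective only in words: a weighted NMF loss under the l_{2,1}-norm plus a co-regularization term under the Frobenius norm. The sketch below shows one way such an objective could be evaluated. It is an illustration under stated assumptions, not the formulation from the article: the function names, the row-wise l_{2,1} convention, the elementwise weight matrices, the consensus matrix `V_star`, and the trade-off parameter `lam` are all hypothetical choices made here.

```python
import numpy as np


def l21_norm(M):
    """l_{2,1}-norm, taken here as the sum of Euclidean norms of the rows.

    Assumption: row-wise convention; the article may define it column-wise.
    """
    return float(np.sqrt((M ** 2).sum(axis=1)).sum())


def multiview_weighted_nmf_objective(views, weights, Us, Vs, V_star, lam=1.0):
    """Evaluate a weighted, co-regularized multi-view NMF objective.

    Hypothetical shapes for each view v:
      X_v : (n, d_v) data matrix
      W_v : (n, d_v) non-negative weights (e.g. small for unobserved entries)
      U_v : (d_v, k) basis matrix
      V_v : (n, k)   per-view cluster indicator matrix
    V_star : (n, k) consensus matrix shared across views.
    """
    total = 0.0
    for X, W, U, V in zip(views, weights, Us, Vs):
        # Weighted reconstruction error under the l_{2,1}-norm.
        residual = W * (X - V @ U.T)
        total += l21_norm(residual)
        # Co-regularization: pull each view's factor toward the consensus.
        total += lam * np.linalg.norm(V - V_star, "fro") ** 2
    return total


# Tiny example: one view that is factorized exactly.
V = np.eye(2)
U = np.array([[1.0, 2.0], [3.0, 4.0]])
X = V @ U.T
W = np.ones_like(X)
print(multiview_weighted_nmf_objective([X], [W], [U], [V], V_star=V))
```

In this toy case the reconstruction is exact and the per-view factor equals the consensus, so both terms vanish. An actual solver would additionally need non-negativity constraints and an update rule, which the summary does not specify.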