Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation
Published in: | Expert systems with applications 2019-10, Vol.132, p.226-238 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Summary: | • We propose automatic ways to assess the quality of collaborative Web content. • Our solutions exploit relaxed multi-view learning (soft views). • Soft views are created automatically by finding clusters of highly correlated features. • Experiments on Wiki datasets show that our solution reduces classification error by up to 20%. • Our automatic views are very similar to those defined manually.
Automated quality assessment of collaboratively created Web content is important for scalability and for avoiding bias. The state-of-the-art solution to this problem relies on multi-view learning, where quality is treated as a multifaceted concept that can be learned from human assessments. To this end, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, and user edit history. Determining the views and combining them properly requires the assistance of an expert, which is difficult when the views overlap or are hard for humans to interpret. In this work we propose an automatic view generator, specifically designed for automated content quality assessment with no manual intervention. Automatic view generation is achieved by finding clusters of highly correlated features. This process is performed iteratively: new clusters are created automatically and evaluated, and those that perform best are kept. Experiments on three popular Wiki datasets show that our automated views reduce the classification error of the original features by up to 20%. This is achieved by automatically generating views that are very similar to those built manually, while keeping only a small set of features to reduce noise and overfitting. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2019.04.053 |
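
The summary describes view generation as clustering highly correlated features, evaluating candidate clusterings, and keeping the best one. The following is a minimal sketch of that general idea, not the authors' algorithm: the abstract does not specify the clustering procedure or the evaluation measure, so the greedy grouping rule, the thresholds, the toy data, and the names `pearson`, `correlation_views`, and `best_views` are all assumptions made for illustration. In a real setting the `score` argument would presumably be the validation performance of a classifier trained on the candidate views; here it is a placeholder.

```python
from math import sqrt
from typing import Callable, List, Sequence

Views = List[List[int]]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def correlation_views(columns: Sequence[Sequence[float]], threshold: float) -> Views:
    """Greedily group feature indices by correlation with a seed feature.

    Assumed rule (not from the source): a feature joins a view when its
    absolute correlation with the view's first feature is >= threshold.
    """
    unassigned = list(range(len(columns)))
    views: Views = []
    while unassigned:
        seed = unassigned.pop(0)
        view = [seed]
        for j in unassigned[:]:  # iterate over a copy while removing
            if abs(pearson(columns[seed], columns[j])) >= threshold:
                view.append(j)
                unassigned.remove(j)
        views.append(view)
    return views


def best_views(
    columns: Sequence[Sequence[float]],
    thresholds: Sequence[float],
    score: Callable[[Views], float],
) -> Views:
    """Build one candidate view set per threshold and keep the best-scoring one."""
    candidates = [correlation_views(columns, t) for t in thresholds]
    return max(candidates, key=score)


# Toy data: one list per feature (column), one entry per document.
columns = [
    [1, 2, 3, 4, 5],    # feature 0
    [2, 4, 6, 8, 10],   # feature 1 = 2 * feature 0
    [5, 3, 4, 1, 2],    # feature 2
    [10, 6, 8, 2, 4],   # feature 3 = 2 * feature 2
]

strict = correlation_views(columns, threshold=0.9)   # [[0, 1], [2, 3]]
loose = correlation_views(columns, threshold=0.5)    # [[0, 1, 2, 3]]

# Placeholder score (an assumption): prefer the candidate with more views.
chosen = best_views(columns, thresholds=[0.5, 0.9], score=len)
```

With the toy data, the strict threshold separates the two pairs of proportional features into two views, while the loose threshold merges everything into one view because features 0 and 2 have a correlation of -0.8. Passing the scoring function as an argument keeps the cluster generation step independent of whatever classifier and metric are used to evaluate the views.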