Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation

Bibliographic Details
Published in:	Expert systems with applications 2019-10, Vol.132, p.226-238
Main Authors:	Magalhães, Luiz Felipe Gonçalves; Gonçalves, Marcos André; Canuto, Sérgio Daniel; Dalip, Daniel H.; Cristo, Marco; Calado, Pável
Format:	Article
Language:	English
Subjects:	Automatic text quality assessment; Automation; Clusters; Information retrieval; Machine learning; Multi-view; Noise reduction; Quality; Quality assessment

Description
Summary:
• We propose automatic ways of assessing the quality of collaborative Web content.
• Our solutions exploit relaxed multi-view learning (soft views).
• Automatic soft views are created by finding clusters of highly correlated features.
• Experiments on Wiki datasets show that our solution reduces classification error by up to 20%.
• Our automatic views are very similar to those manually defined.

Automated quality assessment of collaboratively created Web content is important for ensuring scalability and avoiding bias. The state-of-the-art solution for this problem relies on multi-view learning, where quality is considered a multifaceted concept that can be learned from human assessments. To this end, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, and user edit history. Determining the views and combining them properly requires the assistance of an expert, which is difficult when the views overlap or are hard for humans to interpret. In this work we propose an automatic view generator, specifically designed for the problem of automated content quality assessment with no manual intervention. Automatic view generation is achieved by finding clusters of highly correlated features. This process is performed iteratively, by automatically creating new clusters, evaluating them, and keeping those that perform best. Experiments on three popular Wiki datasets show that our automated views are able to reduce the classification error of the original features by up to 20%. This is achieved by automatically generating views that are very similar to those built manually, while keeping only a small set of features to reduce noise and overfitting.
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2019.04.053
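
The summary describes view generation as clustering highly correlated features, then iteratively creating candidate clusterings, evaluating them, and keeping the best. The record does not state which clustering algorithm or evaluation measure the authors use, so the following is only a minimal sketch under stated assumptions: views are taken to be connected components of a graph that links features whose absolute Pearson correlation meets a threshold, and the iteration is a loop over a few thresholds. The names `correlated_views`, `best_views`, `threshold`, and the caller-supplied `score` function are hypothetical and do not come from the article.

```python
import numpy as np


def correlated_views(X, threshold=0.8):
    """Group the feature columns of X into views.

    Assumption (not from the article): a view is a connected component of
    the graph linking features whose absolute Pearson correlation is at
    least `threshold`.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = set(range(corr.shape[0]))
    views = []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        view, frontier = [seed], [seed]
        # Grow the view by following correlation links from each member.
        while frontier:
            i = frontier.pop()
            linked = [j for j in unassigned if corr[i, j] >= threshold]
            for j in linked:
                unassigned.remove(j)
            view.extend(linked)
            frontier.extend(linked)
        views.append(sorted(view))
    return views


def best_views(X, y, score, thresholds=(0.5, 0.7, 0.9)):
    """Create one candidate set of views per threshold and keep the one
    with the highest value of the caller-supplied score(X, y, views)."""
    candidates = [correlated_views(X, t) for t in thresholds]
    return max(candidates, key=lambda views: score(X, y, views))
```

In practice `score` would be something like the cross-validated accuracy of a classifier trained on the per-view features; it is left to the caller here because the record does not specify the evaluation procedure.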