
Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation

• We propose automatic ways for assessing the quality of collaborative Web content.
• Our solutions exploit relaxed multiview learning (soft views).
• Automatic soft views are created by finding clusters of highly correlated features.
• Experiments on Wiki sets show that our solution reduces classification...

Bibliographic Details
Published in:	Expert systems with applications 2019-10, Vol.132, p.226-238
Main Authors:	Magalhães, Luiz Felipe Gonçalves; Gonçalves, Marcos André; Canuto, Sérgio Daniel; Dalip, Daniel H.; Cristo, Marco; Calado, Pável
Format:	Article
Language:	English
Subjects:	Automatic text quality assessment; Automation; Clusters; Information retrieval; Machine learning; Multi-view; Noise reduction; Quality; Quality assessment

container_end_page	238
container_issue
container_start_page	226
container_title	Expert systems with applications
container_volume	132
creator	Magalhães, Luiz Felipe Gonçalves; Gonçalves, Marcos André; Canuto, Sérgio Daniel; Dalip, Daniel H.; Cristo, Marco; Calado, Pável
description	Highlights: • We propose automatic ways for assessing the quality of collaborative Web content. • Our solutions exploit relaxed multiview learning (soft views). • Automatic soft views are created by finding clusters of highly correlated features. • Experiments on Wiki sets show that our solution reduces classification error by up to 20%. • Our automatic views are very similar to those manually defined. Abstract: Automated quality assessment of collaboratively created Web content is important to guarantee scalability and lack of bias. The state-of-the-art solution for this problem relies on multi-view learning, where quality is considered a multifaceted concept that can be learned from human assessments. To this effect, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, user edit history, etc. The tasks of determining the views and properly combining them require the assistance of an expert, which is hard to do in scenarios where they are overlapping or hard to interpret by humans. In this work we propose an automatic view generator, specially designed for the problem of automated content quality assessment with no manual intervention. Automatic view generation is achieved by finding clusters of highly correlated features. This process is performed iteratively, by automatically creating new clusters, evaluating them, and keeping those that perform the best. Experiments on three popular Wiki datasets show that our automated views are able to reduce the classification error of the original features by up to 20%. This happens by automatically generating views that are very similar to those manually built, while keeping only a small set of features to reduce noise and overfitting.
doi_str_mv	10.1016/j.eswa.2019.04.053
format	article
els_id	S0957417419302830
publisher	New York: Elsevier Ltd
date	2019-10-15
eissn	1873-6793
tpages	13
rights	2019 Elsevier Ltd; Copyright Elsevier BV Oct 15, 2019
orcid	https://orcid.org/0000-0002-2075-3363
toplevel	peer_reviewed; online_resources
fulltext	fulltext
identifier	ISSN: 0957-4174
ispartof	Expert systems with applications, 2019-10, Vol.132, p.226-238
issn	0957-4174 1873-6793
language	eng
recordid	cdi_proquest_journals_2249736887
source	ScienceDirect Freedom Collection
subjects	Automatic text quality assessment; Automation; Clusters; Information retrieval; Machine learning; Multi-view; Noise reduction; Quality; Quality assessment
title	Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T06%3A25%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Quality%20assessment%20of%20collaboratively-created%20web%20content%20with%20no%20manual%20intervention%20based%20on%20soft%20multi-view%20generation&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Magalh%C3%A3es,%20Luiz%20Felipe%20Gon%C3%A7alves&rft.date=2019-10-15&rft.volume=132&rft.spage=226&rft.epage=238&rft.pages=226-238&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2019.04.053&rft_dat=%3Cproquest_cross%3E2249736887%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2249736887&rft_id=info:pmid/&rfr_iscdi=true