Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation

• We propose automatic ways for assessing the quality of collaborative Web content.
• Our solutions exploit relaxed multiview learning (soft views).
• Automatic soft views are created by finding clusters of highly correlated features.
• Experiments on Wiki sets show that our solution reduces classification...

Bibliographic Details
Published in: Expert systems with applications 2019-10, Vol.132, p.226-238
Main Authors: Magalhães, Luiz Felipe Gonçalves, Gonçalves, Marcos André, Canuto, Sérgio Daniel, Dalip, Daniel H., Cristo, Marco, Calado, Pável
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713
cites cdi_FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713
container_end_page 238
container_issue
container_start_page 226
container_title Expert systems with applications
container_volume 132
creator Magalhães, Luiz Felipe Gonçalves
Gonçalves, Marcos André
Canuto, Sérgio Daniel
Dalip, Daniel H.
Cristo, Marco
Calado, Pável
description • We propose automatic ways of assessing the quality of collaborative Web content. • Our solutions exploit relaxed multi-view learning (soft views). • Automatic soft views are created by finding clusters of highly correlated features. • Experiments on Wiki datasets show that our solution reduces classification error by up to 20%. • Our automatic views are very similar to those manually defined. Automated quality assessment of collaboratively created Web content is important for scalability and for avoiding bias. The state-of-the-art solution to this problem relies on multi-view learning, in which quality is treated as a multifaceted concept that can be learned from human assessments. To this end, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, and user edit history. Determining the views and combining them properly requires the assistance of an expert, which is difficult when the views overlap or are hard for humans to interpret. In this work we propose an automatic view generator, designed specifically for automated content quality assessment with no manual intervention. Views are generated automatically by finding clusters of highly correlated features. The process is iterative: new clusters are created automatically and evaluated, and those that perform best are kept. Experiments on three popular Wiki datasets show that our automated views reduce the classification error of the original features by up to 20%. They do so by producing views that are very similar to those built manually while keeping only a small set of features, which reduces noise and overfitting.
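The description says that views are formed by finding clusters of highly correlated features. The sketch below illustrates only that one idea and is not the authors' algorithm: it links any two features whose absolute Pearson correlation reaches a threshold and treats each connected group as a view. The function names, the threshold of 0.8, and the toy feature values are all assumptions made for illustration; the iterative create, evaluate, and keep-the-best step mentioned in the description is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlated_views(columns, threshold=0.8):
    """Group feature names into views (hypothetical helper).

    Two features are linked when |correlation| >= threshold; each
    connected component of the resulting graph becomes one view.
    """
    names = list(columns)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)

    views = {}
    for name in names:
        views.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in views.values())


# Toy feature columns (invented values, one entry per article).
features = {
    "word_count": [100, 200, 300, 400, 500],
    "sentence_count": [10, 21, 29, 41, 50],
    "edit_count": [5, 1, 4, 2, 3],
    "editor_count": [4, 1, 5, 2, 3],
}
views = correlated_views(features, threshold=0.8)
print(views)
```

On this toy data the two length-related features fall into one view and the two edit-history features into another, which mirrors the kind of manually defined grouping (text structure versus edit history) that the description refers to.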
doi_str_mv 10.1016/j.eswa.2019.04.053
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2249736887</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0957417419302830</els_id><sourcerecordid>2249736887</sourcerecordid><originalsourceid>FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713</originalsourceid><addsrcrecordid>eNp9kEtLAzEUhYMoWKt_wFXA9Yx5TCcTcCPiCwoi6DqkmRtNmSY1STv035uhruUucsk959zkQ-iakpoS2t6ua0ijrhmhsiZNTRb8BM1oJ3jVCslP0YzIhagaKppzdJHSmhAqCBEzdHjf6cHlA9YpQUob8BkHi00YBr0KUWe3h-FQmQg6Q49HWJWZz5NsdPkb-4A32pcM7Mpt3JeBCx6vdCrq0qRgM97shuyqvYMRf4GHKTX4S3Rm9ZDg6u-co8-nx4-Hl2r59vz6cL-sDGddrnpNLSWkYZy20uqe9s1C29Zwyy3orl-Yhq045VJ2DFpTSjMpRCcN6VojKJ-jm2PuNoafHaSs1mEXfVmpGGuk4G1XOM0RO6pMDClFsGob3UbHg6JETYjVWk2I1YRYkUYVxMV0dzRBeX_5XlTJOPAGehfBZNUH95_9FxzEh78</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2249736887</pqid></control><display><type>article</type><title>Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation</title><source>ScienceDirect Freedom Collection</source><creator>Magalhães, Luiz Felipe Gonçalves ; Gonçalves, Marcos André ; Canuto, Sérgio Daniel ; Dalip, Daniel H. ; Cristo, Marco ; Calado, Pável</creator><creatorcontrib>Magalhães, Luiz Felipe Gonçalves ; Gonçalves, Marcos André ; Canuto, Sérgio Daniel ; Dalip, Daniel H. ; Cristo, Marco ; Calado, Pável</creatorcontrib><description>•We propose automatic ways for assessing the quality of collaborative Web content.•Our solutions exploit relaxed multiview learning (soft views).•Automatic soft views are created by finding clusters of highly correlated features.•Experiments on Wiki sets show that our solution reduces classification error by 20.•Our automatic views are very similar to those manually defined. 
Automated quality assessment of collaboratively created Web content is important to guarantee scalability and lack of bias. The state-of-the-art solution for this problem relies on multi-view learning, where quality is considered a multifaceted concept that can be learned from human assessments. To this effect, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, user edit history, etc. The tasks of determining the views and properly combining them require the assistance of an expert, which is hard to do in scenarios where they are overlapping or hard to interpret by humans. In this work we propose an automatic view generator, specially designed for the problem of automated content quality assessment with no manual intervention. Automatic view generation is achieved by finding clusters of highly correlated features. This process is performed iteratively, by automatically creating new clusters, evaluating them, and keeping those that perform the best. Experiments on three popular Wiki datasets show that our automated views are able to reduce the classification error of the original features by up to 20%. 
This happens by automatically generating views that are very similar to those manually built, while keeping only a small set of features to reduce noise and overfitting.</description><identifier>ISSN: 0957-4174</identifier><identifier>EISSN: 1873-6793</identifier><identifier>DOI: 10.1016/j.eswa.2019.04.053</identifier><language>eng</language><publisher>New York: Elsevier Ltd</publisher><subject>Automatic text quality assessment ; Automation ; Clusters ; Information retrieval ; Machine learning ; Multi-view ; Noise reduction ; Quality ; Quality assessment</subject><ispartof>Expert systems with applications, 2019-10, Vol.132, p.226-238</ispartof><rights>2019 Elsevier Ltd</rights><rights>Copyright Elsevier BV Oct 15, 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713</citedby><cites>FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713</cites><orcidid>0000-0002-2075-3363</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids></links><search><creatorcontrib>Magalhães, Luiz Felipe Gonçalves</creatorcontrib><creatorcontrib>Gonçalves, Marcos André</creatorcontrib><creatorcontrib>Canuto, Sérgio Daniel</creatorcontrib><creatorcontrib>Dalip, Daniel H.</creatorcontrib><creatorcontrib>Cristo, Marco</creatorcontrib><creatorcontrib>Calado, Pável</creatorcontrib><title>Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation</title><title>Expert systems with applications</title><description>•We propose automatic ways for assessing the quality of collaborative Web content.•Our solutions exploit relaxed multiview learning (soft views).•Automatic soft views are created by 
finding clusters of highly correlated features.•Experiments on Wiki sets show that our solution reduces classification error by 20.•Our automatic views are very similar to those manually defined. Automated quality assessment of collaboratively created Web content is important to guarantee scalability and lack of bias. The state-of-the-art solution for this problem relies on multi-view learning, where quality is considered a multifaceted concept that can be learned from human assessments. To this effect, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, user edit history, etc. The tasks of determining the views and properly combining them require the assistance of an expert, which is hard to do in scenarios where they are overlapping or hard to interpret by humans. In this work we propose an automatic view generator, specially designed for the problem of automated content quality assessment with no manual intervention. Automatic view generation is achieved by finding clusters of highly correlated features. This process is performed iteratively, by automatically creating new clusters, evaluating them, and keeping those that perform the best. Experiments on three popular Wiki datasets show that our automated views are able to reduce the classification error of the original features by up to 20%. 
This happens by automatically generating views that are very similar to those manually built, while keeping only a small set of features to reduce noise and overfitting.</description><subject>Automatic text quality assessment</subject><subject>Automation</subject><subject>Clusters</subject><subject>Information retrieval</subject><subject>Machine learning</subject><subject>Multi-view</subject><subject>Noise reduction</subject><subject>Quality</subject><subject>Quality assessment</subject><issn>0957-4174</issn><issn>1873-6793</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLAzEUhYMoWKt_wFXA9Yx5TCcTcCPiCwoi6DqkmRtNmSY1STv035uhruUucsk959zkQ-iakpoS2t6ua0ijrhmhsiZNTRb8BM1oJ3jVCslP0YzIhagaKppzdJHSmhAqCBEzdHjf6cHlA9YpQUob8BkHi00YBr0KUWe3h-FQmQg6Q49HWJWZz5NsdPkb-4A32pcM7Mpt3JeBCx6vdCrq0qRgM97shuyqvYMRf4GHKTX4S3Rm9ZDg6u-co8-nx4-Hl2r59vz6cL-sDGddrnpNLSWkYZy20uqe9s1C29Zwyy3orl-Yhq045VJ2DFpTSjMpRCcN6VojKJ-jm2PuNoafHaSs1mEXfVmpGGuk4G1XOM0RO6pMDClFsGob3UbHg6JETYjVWk2I1YRYkUYVxMV0dzRBeX_5XlTJOPAGehfBZNUH95_9FxzEh78</recordid><startdate>20191015</startdate><enddate>20191015</enddate><creator>Magalhães, Luiz Felipe Gonçalves</creator><creator>Gonçalves, Marcos André</creator><creator>Canuto, Sérgio Daniel</creator><creator>Dalip, Daniel H.</creator><creator>Cristo, Marco</creator><creator>Calado, Pável</creator><general>Elsevier Ltd</general><general>Elsevier BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2075-3363</orcidid></search><sort><creationdate>20191015</creationdate><title>Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation</title><author>Magalhães, Luiz Felipe Gonçalves ; Gonçalves, Marcos André ; Canuto, Sérgio Daniel ; Dalip, Daniel H. 
; Cristo, Marco ; Calado, Pável</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Automatic text quality assessment</topic><topic>Automation</topic><topic>Clusters</topic><topic>Information retrieval</topic><topic>Machine learning</topic><topic>Multi-view</topic><topic>Noise reduction</topic><topic>Quality</topic><topic>Quality assessment</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Magalhães, Luiz Felipe Gonçalves</creatorcontrib><creatorcontrib>Gonçalves, Marcos André</creatorcontrib><creatorcontrib>Canuto, Sérgio Daniel</creatorcontrib><creatorcontrib>Dalip, Daniel H.</creatorcontrib><creatorcontrib>Cristo, Marco</creatorcontrib><creatorcontrib>Calado, Pável</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Expert systems with applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Magalhães, Luiz Felipe Gonçalves</au><au>Gonçalves, Marcos André</au><au>Canuto, Sérgio Daniel</au><au>Dalip, Daniel H.</au><au>Cristo, Marco</au><au>Calado, Pável</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation</atitle><jtitle>Expert systems with 
applications</jtitle><date>2019-10-15</date><risdate>2019</risdate><volume>132</volume><spage>226</spage><epage>238</epage><pages>226-238</pages><issn>0957-4174</issn><eissn>1873-6793</eissn><abstract>•We propose automatic ways for assessing the quality of collaborative Web content.•Our solutions exploit relaxed multiview learning (soft views).•Automatic soft views are created by finding clusters of highly correlated features.•Experiments on Wiki sets show that our solution reduces classification error by 20.•Our automatic views are very similar to those manually defined. Automated quality assessment of collaboratively created Web content is important to guarantee scalability and lack of bias. The state-of-the-art solution for this problem relies on multi-view learning, where quality is considered a multifaceted concept that can be learned from human assessments. To this effect, features describing quality have been devised and grouped into views based on criteria such as text structure, readability, style, user edit history, etc. The tasks of determining the views and properly combining them require the assistance of an expert, which is hard to do in scenarios where they are overlapping or hard to interpret by humans. In this work we propose an automatic view generator, specially designed for the problem of automated content quality assessment with no manual intervention. Automatic view generation is achieved by finding clusters of highly correlated features. This process is performed iteratively, by automatically creating new clusters, evaluating them, and keeping those that perform the best. Experiments on three popular Wiki datasets show that our automated views are able to reduce the classification error of the original features by up to 20%. 
This happens by automatically generating views that are very similar to those manually built, while keeping only a small set of features to reduce noise and overfitting.</abstract><cop>New York</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.eswa.2019.04.053</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-2075-3363</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0957-4174
ispartof Expert systems with applications, 2019-10, Vol.132, p.226-238
issn 0957-4174
1873-6793
language eng
recordid cdi_proquest_journals_2249736887
source ScienceDirect Freedom Collection
subjects Automatic text quality assessment
Automation
Clusters
Information retrieval
Machine learning
Multi-view
Noise reduction
Quality
Quality assessment
title Quality assessment of collaboratively-created web content with no manual intervention based on soft multi-view generation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T06%3A25%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Quality%20assessment%20of%20collaboratively-created%20web%20content%20with%20no%20manual%20intervention%20based%20on%20soft%20multi-view%20generation&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Magalh%C3%A3es,%20Luiz%20Felipe%20Gon%C3%A7alves&rft.date=2019-10-15&rft.volume=132&rft.spage=226&rft.epage=238&rft.pages=226-238&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2019.04.053&rft_dat=%3Cproquest_cross%3E2249736887%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c328t-da1f100423169fad1d45af6c3f3fea8d5c42b3139982e6c6c6a297789c086c713%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2249736887&rft_id=info:pmid/&rfr_iscdi=true