Leveraging relevant summarized information and multi-layer classification to generalize the detection of misleading headlines

Bibliographic Details
Published in: Data & knowledge engineering, 2023-05, Vol. 145, p. 102176, Article 102176
Main Authors: Sepúlveda-Torres, Robiert ; Vicente, Marta ; Saquete, Estela ; Lloret, Elena ; Palomar, Manuel
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3
cites cdi_FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3
container_end_page
container_issue
container_start_page 102176
container_title Data & knowledge engineering
container_volume 145
creator Sepúlveda-Torres, Robiert
Vicente, Marta
Saquete, Estela
Lloret, Elena
Palomar, Manuel
description Disinformation is an important problem facing society nowadays. Given the rapid and easy access to information, news stories quickly go viral, the vast majority of which are misleading and with no prospect of verification. Specifically, the headline of a correctly designed news item must correspond to a summary of the main information of that news item and it should be neutral. However, many headlines circulating on the Internet use false or distorted information, seeking to confuse or mislead the reader. Misleading headlines indicate a dissonance between the headline and the content of the news story. From a computational perspective, this problem is being tackled as a Stance Detection problem between the headline and the body text of the news item. This paper contributes to the fight against the spread of misleading information by presenting a generic and flexible multi-level hierarchical classification. The approach is based on two stages that enable the detection of the stance between the news headline and the body text. The proposed architecture, called HeadlineStanceChecker+, uses the headline and only the essential information of the news item (not the full body text) as inputs. To extract this essential information, different summarization approaches (extractive and abstractive) are analyzed in order to determine the most relevant information for the task. The experimentation has been carried out using the Fake News Challenge (FNC-1) dataset. A 94.49% accuracy was obtained using extractive summaries, which were more helpful than abstractive ones. HeadlineStanceChecker+ improves the accuracy results of existing state-of-the-art systems. In conclusion, using automatic extractive summaries together with the two-stage generic architecture is an effective solution to the problem.
• HeadlineStanceChecker+ is an enhanced system for misleading headlines detection.
• Misleading headlines detection is tackled as a stance detection problem.
• Summarization supports stance detection by only selecting relevant information.
• A hierarchical classifier effectively determines the stance of a headline.
• Results show that extractive approaches perform better for the stance detection task.
doi_str_mv 10.1016/j.datak.2023.102176
format article
fullrecord <record><control><sourceid>elsevier_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1016_j_datak_2023_102176</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0169023X23000368</els_id><sourcerecordid>S0169023X23000368</sourcerecordid><originalsourceid>FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3</originalsourceid><addsrcrecordid>eNp9UE1LAzEUDKJgrf4CL_kDW_Ox23QPHqT4BQUvCt5CNnlpU7NZSdKCgv_dtOvZ08CbN_PeDELXlMwoofOb7cyorD5mjDBeJoyK-Qma0IVg1bzl_BRNylZbFfb9HF2ktCWEsJo0E_Szgj1EtXZhjSN42KuQcdr1vYruGwx2wQ6xV9kNAatgcL_z2VVefUHE2quUnHV6pPOA1xCKmS9KnDeADWTQR26wuHfJgzKHQ5uC3gVIl-jMKp_g6g-n6O3h_nX5VK1eHp-Xd6tK83qRK0Go0l3ddYIbxhpuqTWka0zbKLPQVrS1sJQKYZUQXUOgpoSY1ipqNDN1a_kU8dFXxyGlCFZ-RlcifklK5KFBuZXHBuWhQTk2WFS3owrKa3sHUSbtIGgwLpZc0gzuX_0vVZN_FA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Leveraging relevant summarized information and multi-layer classification to generalize the detection of misleading headlines</title><source>Elsevier</source><creator>Sepúlveda-Torres, Robiert ; Vicente, Marta ; Saquete, Estela ; Lloret, Elena ; Palomar, Manuel</creator><creatorcontrib>Sepúlveda-Torres, Robiert ; Vicente, Marta ; Saquete, Estela ; Lloret, Elena ; Palomar, Manuel</creatorcontrib><description>Disinformation is an important problem facing society nowadays. Given the rapid and easy access to information, news stories quickly go viral, the vast majority of which are misleading and with no prospect of verification. Specifically, the headline of a correctly designed news item must correspond to a summary of the main information of that news item and it should be neutral. However, many headlines circulating on the Internet use false or distorted information, seeking to confuse or mislead the reader. 
Misleading headlines indicate a dissonance between the headline and the content of the news story. From a computational perspective, this problem is being tackled as a Stance Detection problem between the headline and the body text of the news item. This paper contributes to the fight against the spread of misleading information by presenting a generic and flexible multi-level hierarchical classification. The approach is based on two stages that enable the detection of the stance between the news headline and the body text. The proposed architecture, called HeadlineStanceChecker+ uses the headline and only the essential information of the news item (not the full body text) as inputs. To extract this essential information, different summarization approaches (extractive and abstractive) are analyzed in order to determine the most relevant information for the task. The experimentation has been carried out using the Fake News Challenge (FNC-1) dataset. A 94.49% accuracy was obtained using extractive summaries, which were more helpful than abstractive ones. HeadlineStanceChecker+ improves the accuracy results of existing state-of-the-art systems. In conclusion, using automatic extractive summaries together with the two-stage generic architecture is an effective solution to the problem. 
•HeadlineStanceChecker+ is an enhanced system for misleading headlines detection.•Misleading headlines detection is tackled as a stance detection problem.•Summarization supports stance detection by only selecting relevant information.•A hierarchical classifier effectively determines the stance of a headline.•Results show that extractive approaches perform better for stance detection task.</description><identifier>ISSN: 0169-023X</identifier><identifier>EISSN: 1872-6933</identifier><identifier>DOI: 10.1016/j.datak.2023.102176</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Fake news ; Misleading headlines ; Natural language processing ; Stance detection</subject><ispartof>Data &amp; knowledge engineering, 2023-05, Vol.145, p.102176, Article 102176</ispartof><rights>2023 The Author(s)</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3</citedby><cites>FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Sepúlveda-Torres, Robiert</creatorcontrib><creatorcontrib>Vicente, Marta</creatorcontrib><creatorcontrib>Saquete, Estela</creatorcontrib><creatorcontrib>Lloret, Elena</creatorcontrib><creatorcontrib>Palomar, Manuel</creatorcontrib><title>Leveraging relevant summarized information and multi-layer classification to generalize the detection of misleading headlines</title><title>Data &amp; knowledge engineering</title><description>Disinformation is an important problem facing society nowadays. 
Given the rapid and easy access to information, news stories quickly go viral, the vast majority of which are misleading and with no prospect of verification. Specifically, the headline of a correctly designed news item must correspond to a summary of the main information of that news item and it should be neutral. However, many headlines circulating on the Internet use false or distorted information, seeking to confuse or mislead the reader. Misleading headlines indicate a dissonance between the headline and the content of the news story. From a computational perspective, this problem is being tackled as a Stance Detection problem between the headline and the body text of the news item. This paper contributes to the fight against the spread of misleading information by presenting a generic and flexible multi-level hierarchical classification. The approach is based on two stages that enable the detection of the stance between the news headline and the body text. The proposed architecture, called HeadlineStanceChecker+ uses the headline and only the essential information of the news item (not the full body text) as inputs. To extract this essential information, different summarization approaches (extractive and abstractive) are analyzed in order to determine the most relevant information for the task. The experimentation has been carried out using the Fake News Challenge (FNC-1) dataset. A 94.49% accuracy was obtained using extractive summaries, which were more helpful than abstractive ones. HeadlineStanceChecker+ improves the accuracy results of existing state-of-the-art systems. In conclusion, using automatic extractive summaries together with the two-stage generic architecture is an effective solution to the problem. 
•HeadlineStanceChecker+ is an enhanced system for misleading headlines detection.•Misleading headlines detection is tackled as a stance detection problem.•Summarization supports stance detection by only selecting relevant information.•A hierarchical classifier effectively determines the stance of a headline.•Results show that extractive approaches perform better for stance detection task.</description><subject>Fake news</subject><subject>Misleading headlines</subject><subject>Natural language processing</subject><subject>Stance detection</subject><issn>0169-023X</issn><issn>1872-6933</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9UE1LAzEUDKJgrf4CL_kDW_Ox23QPHqT4BQUvCt5CNnlpU7NZSdKCgv_dtOvZ08CbN_PeDELXlMwoofOb7cyorD5mjDBeJoyK-Qma0IVg1bzl_BRNylZbFfb9HF2ktCWEsJo0E_Szgj1EtXZhjSN42KuQcdr1vYruGwx2wQ6xV9kNAatgcL_z2VVefUHE2quUnHV6pPOA1xCKmS9KnDeADWTQR26wuHfJgzKHQ5uC3gVIl-jMKp_g6g-n6O3h_nX5VK1eHp-Xd6tK83qRK0Go0l3ddYIbxhpuqTWka0zbKLPQVrS1sJQKYZUQXUOgpoSY1ipqNDN1a_kU8dFXxyGlCFZ-RlcifklK5KFBuZXHBuWhQTk2WFS3owrKa3sHUSbtIGgwLpZc0gzuX_0vVZN_FA</recordid><startdate>202305</startdate><enddate>202305</enddate><creator>Sepúlveda-Torres, Robiert</creator><creator>Vicente, Marta</creator><creator>Saquete, Estela</creator><creator>Lloret, Elena</creator><creator>Palomar, Manuel</creator><general>Elsevier B.V</general><scope>6I.</scope><scope>AAFTH</scope><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>202305</creationdate><title>Leveraging relevant summarized information and multi-layer classification to generalize the detection of misleading headlines</title><author>Sepúlveda-Torres, Robiert ; Vicente, Marta ; Saquete, Estela ; Lloret, Elena ; Palomar, 
Manuel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Fake news</topic><topic>Misleading headlines</topic><topic>Natural language processing</topic><topic>Stance detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sepúlveda-Torres, Robiert</creatorcontrib><creatorcontrib>Vicente, Marta</creatorcontrib><creatorcontrib>Saquete, Estela</creatorcontrib><creatorcontrib>Lloret, Elena</creatorcontrib><creatorcontrib>Palomar, Manuel</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>CrossRef</collection><jtitle>Data &amp; knowledge engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sepúlveda-Torres, Robiert</au><au>Vicente, Marta</au><au>Saquete, Estela</au><au>Lloret, Elena</au><au>Palomar, Manuel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Leveraging relevant summarized information and multi-layer classification to generalize the detection of misleading headlines</atitle><jtitle>Data &amp; knowledge engineering</jtitle><date>2023-05</date><risdate>2023</risdate><volume>145</volume><spage>102176</spage><pages>102176-</pages><artnum>102176</artnum><issn>0169-023X</issn><eissn>1872-6933</eissn><abstract>Disinformation is an important problem facing society nowadays. Given the rapid and easy access to information, news stories quickly go viral, the vast majority of which are misleading and with no prospect of verification. 
Specifically, the headline of a correctly designed news item must correspond to a summary of the main information of that news item and it should be neutral. However, many headlines circulating on the Internet use false or distorted information, seeking to confuse or mislead the reader. Misleading headlines indicate a dissonance between the headline and the content of the news story. From a computational perspective, this problem is being tackled as a Stance Detection problem between the headline and the body text of the news item. This paper contributes to the fight against the spread of misleading information by presenting a generic and flexible multi-level hierarchical classification. The approach is based on two stages that enable the detection of the stance between the news headline and the body text. The proposed architecture, called HeadlineStanceChecker+ uses the headline and only the essential information of the news item (not the full body text) as inputs. To extract this essential information, different summarization approaches (extractive and abstractive) are analyzed in order to determine the most relevant information for the task. The experimentation has been carried out using the Fake News Challenge (FNC-1) dataset. A 94.49% accuracy was obtained using extractive summaries, which were more helpful than abstractive ones. HeadlineStanceChecker+ improves the accuracy results of existing state-of-the-art systems. In conclusion, using automatic extractive summaries together with the two-stage generic architecture is an effective solution to the problem. 
•HeadlineStanceChecker+ is an enhanced system for misleading headlines detection.•Misleading headlines detection is tackled as a stance detection problem.•Summarization supports stance detection by only selecting relevant information.•A hierarchical classifier effectively determines the stance of a headline.•Results show that extractive approaches perform better for stance detection task.</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.datak.2023.102176</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0169-023X
ispartof Data & knowledge engineering, 2023-05, Vol.145, p.102176, Article 102176
issn 0169-023X
1872-6933
language eng
recordid cdi_crossref_primary_10_1016_j_datak_2023_102176
source Elsevier
subjects Fake news
Misleading headlines
Natural language processing
Stance detection
title Leveraging relevant summarized information and multi-layer classification to generalize the detection of misleading headlines
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T23%3A06%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Leveraging%20relevant%20summarized%20information%20and%20multi-layer%20classification%20to%20generalize%20the%20detection%20of%20misleading%20headlines&rft.jtitle=Data%20&%20knowledge%20engineering&rft.au=Sep%C3%BAlveda-Torres,%20Robiert&rft.date=2023-05&rft.volume=145&rft.spage=102176&rft.pages=102176-&rft.artnum=102176&rft.issn=0169-023X&rft.eissn=1872-6933&rft_id=info:doi/10.1016/j.datak.2023.102176&rft_dat=%3Celsevier_cross%3ES0169023X23000368%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c348t-701acb4bb73d2253f1fd0b5d95ad8cf7947f1177fa77b50e4100d9fa1dc2d49f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true