
Sentence Embeddings and Semantic Entity Extraction for Identification of Topics of Short Fact-Checked Claims

Bibliographic Details
Published in: Information (Basel) 2024-10, Vol.15 (10), p.659
Main Authors: Węcel, Krzysztof, Sawiński, Marcin, Lewoniewski, Włodzimierz, Stróżyna, Milena, Księżniak, Ewelina, Abramowicz, Witold
Format: Article
Language: English
Description
Summary: The objective of this research was to design a method to assign topics to claims debunked by fact-checking agencies. During the fact-checking process, access to more structured knowledge is necessary; therefore, we aim to describe topics with semantic vocabulary. Classification of topics should go beyond simple connotations like instance-class and rather reflect broader phenomena that are recognized by fact checkers. The assignment of semantic entities is also crucial for the automatic verification of facts using the underlying knowledge graphs. Our method is based on sentence embeddings, various clustering methods (HDBSCAN, UMAP, K-means), semantic entity matching, and term importance assessment based on TF-IDF. We represent our topics in semantic space using Wikidata Q-ids, DBpedia, Wikipedia topics, YAGO, and other relevant ontologies. Such an approach based on semantic entities also supports hierarchical navigation within topics. For evaluation, we compare topic modeling results with claims already tagged by fact checkers. The work presented in this paper is useful for researchers and practitioners interested in semantic topic modeling of fake news narratives.
ISSN:2078-2489
DOI: 10.3390/info15100659
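
Illustration: The summary names TF-IDF-based term importance as one step for describing clusters of claims. The sketch below is not the authors' implementation. It assumes the claims have already been grouped (for example by clustering sentence embeddings, which is not shown here) and uses a common variant in which each cluster is treated as a single document. The function names `tokenize` and `top_terms_per_cluster`, the sample claims, and the cluster labels are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop tokens shorter than 3 letters."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if len(t) > 2]


def top_terms_per_cluster(claims, labels, k=3):
    """Rank terms per cluster by TF-IDF, treating each cluster as one document."""
    # Merge all claims that share a label into a single bag of words.
    bags = {}
    for claim, label in zip(claims, labels):
        bags.setdefault(label, Counter()).update(tokenize(claim))

    n_clusters = len(bags)
    # Document frequency: the number of clusters in which each term occurs.
    df = Counter(term for bag in bags.values() for term in bag)

    result = {}
    for label, bag in bags.items():
        total = sum(bag.values())
        scores = {
            term: (count / total) * math.log(n_clusters / df[term])
            for term, count in bag.items()
        }
        # Sort by descending score, then alphabetically so ties are stable.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[label] = ranked[:k]
    return result


# Hypothetical sample data: four short claims in two assumed clusters.
claims = [
    "The vaccine causes autism in children",
    "The vaccine contains a tracking microchip",
    "The election was stolen by ballot fraud",
    "Mail ballots were counted twice in the election",
]
labels = [0, 0, 1, 1]
topics = top_terms_per_cluster(claims, labels, k=2)
```

Terms that occur in every cluster (here "the") receive a score of zero, so they never outrank cluster-specific terms. In a pipeline like the one summarized above, the top-ranked terms would then be candidates for matching against semantic entities such as Wikidata Q-ids; that matching step is outside this sketch.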