Loading…

Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications

The negative impact of false information on social networks is rapidly growing. Current research on the topic focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a l...

Full description

Saved in:
Bibliographic Details
Published in:Mathematics (Basel) 2021-11, Vol.9 (22), p.2988
Main Authors: Guimarães, Nuno, Figueira, Álvaro, Torgo, Luís
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The negative impact of false information on social networks is rapidly growing. Current research on the topic focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a long-term scenario where the topics discussed may change is lacking. In this work, we deviate from current approaches to the problem and instead focus on a longitudinal evaluation using social network publications spanning an 18-month period. We evaluate different combinations of features and supervised models in a long-term scenario where the training and testing data are ordered chronologically, and thus the robustness and stability of the models can be evaluated through time. We experimented with 3 different scenarios where the models are trained with 15-, 30-, and 60-day data periods. The results show that detection models trained with word-embedding features are the ones that perform better and are less likely to be affected by the change of topics (for example, the rise of COVID-19 conspiracy theories). Furthermore, the additional days of training data also increase the performance of the best feature/model combinations, although not very significantly (around 2%). The results presented in this paper build the foundations towards a more pragmatic approach to the evaluation of fake news detection models in social networks.
ISSN:2227-7390
2227-7390
DOI:10.3390/math9222988