Fake News Detection Using Machine Learning Techniques

Bibliographic Details
Main Authors: Sultana, Achhiya; Islam, Mahmudul; Hasan, Mahady; Ahmed, Farruk
Format: Conference Proceeding
Language: English
Description
Summary: People spread a great deal of information on social media to update their status and share important news with others. However, most of these platforms do not promptly validate users or their posts, and people cannot reliably identify fake news manually. An automated system capable of detecting fake news is therefore needed. This research proposes a model built with four machine learning algorithms. The dataset used in the experiment combines two datasets and contains almost equal numbers of true and fake news articles on politics. Preprocessing begins with cleaning the data (tokenization and removal of punctuation, special characters, white space, redundant words, numerals, and English letters), continues with stemming, and ends with data discretization. The collected data were then analyzed, and 80% of the data was used to train each model. Four classification algorithms were applied to identify fake news in news articles: Logistic Regression, Decision Tree, Random Forest, and Gradient Boosting Classifier. The accuracy of the trained classifiers was evaluated on the remaining 20% of the data. The decision tree model achieves the best accuracy at 99.60%, followed by gradient boosting at 99.55%, random forest at 99.10%, and logistic regression at 98.99%. The study also examines which model achieves the highest precision, recall, and F1-score, based on the confusion matrix.
ISSN: 2770-8209
DOI: 10.1109/SERA57763.2023.10197712
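
The evaluation protocol described in the summary (an 80/20 hold-out split, then accuracy, precision, recall, and F1-score read off the confusion matrix) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `hold_out_split` and `confusion_metrics`, the fixed random seed, and the convention that label 1 means "fake" are all assumptions made for illustration; the record does not say how the split was drawn or which class was treated as positive.

```python
import random


def hold_out_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test).

    Assumption: a simple random split; the record does not state
    whether the original split was stratified or seeded.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def confusion_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the derived metrics.

    Assumption: binary labels, with `positive` (here 1 = fake) as the
    class that precision and recall are reported for.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels only; these are not results from the paper.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
    print(confusion_metrics(y_true, y_pred))
```

In a full pipeline, a classifier such as a decision tree would be fitted on the training portion returned by `hold_out_split`, and its predictions on the test portion would be passed to `confusion_metrics`. Comparing the resulting dictionaries across models is one way to rank them on precision, recall, and F1-score as well as on accuracy.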