A review of machine learning approaches to Spam filtering

Bibliographic Details
Published in: Expert Systems with Applications, 2009-09, Vol. 36 (7), p. 10206-10222
Main Authors: Guzella, Thiago S.; Caminhas, Walmir M.
Format: Article
Language: English
Description
Summary: In this paper, we present a comprehensive review of recent developments in the application of machine learning algorithms to Spam filtering, focusing on both textual- and image-based approaches. Instead of considering Spam filtering as a standard classification problem, we highlight the importance of considering specific characteristics of the problem, especially concept drift, in designing new filters. Two particularly important aspects not widely recognized in the literature are discussed: the difficulties in updating a classifier based on the bag-of-words representation and a major difference between two early naive Bayes models. Overall, we conclude that while important advancements have been made in the last years, several aspects remain to be explored, especially under more realistic evaluation settings.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2009.02.037