Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach

Bibliographic Details
Published in:	IEEE Access, 2022, Vol. 10, pp. 83061-83074
Main Authors:	Mambina, Iddi S.; Ndibwile, Jema D.; Michael, Kisangiri F.
Format:	Article
Language:	English
Subjects:	Africa; African languages; Classification; Classifiers; Electronic commerce; Languages; Machine learning; Malware; Messages; mobile money; Model accuracy; Natural language processing; Performance evaluation; Phishing; Short message service; smishing; SMS; social engineering; Sociology; Statistics; Sub-Saharan Africa; Text messaging; Unsolicited e-mail

Description
Summary:	Due to the massive adoption of mobile money in Sub-Saharan countries, the global transaction value of mobile money exceeded $2 billion in 2021. Projections show transaction values will exceed $3 billion by the end of 2022, and Sub-Saharan Africa contributes half of the daily transactions. SMS (Short Message Service) phishing costs corporations and individuals millions of dollars annually. Spammers use Smishing (SMS Phishing) messages to trick a mobile money user into sending electronic cash to an unintended mobile wallet. Though Smishing is an incarnation of phishing, the two differ in the information available and in attack strategy, which makes Smishing difficult to detect. Numerous models and techniques to detect Smishing attacks have been introduced for high-resource languages, yet few target low-resource languages such as Swahili. This study proposes a machine-learning-based model to classify Swahili Smishing text messages targeting mobile money users. Experimental results show that a hybrid model combining Extra-Trees classifier feature selection with a Random Forest classifier, using TF-IDF (Term Frequency-Inverse Document Frequency) vectorization, performs best, with an accuracy of 99.86%. Results are measured against a baseline Multinomial Naïve Bayes model and compared with a set of other classic classifiers. The model returns the lowest false-positive and false-negative counts, 2 and 4 respectively, with a log loss of 0.04. A Swahili dataset of 32,259 messages is used for performance evaluation.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3196464
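
Illustrative sketch: The summary names a pipeline of TF-IDF vectorization, Extra-Trees feature selection and a Random Forest classifier, evaluated against a Multinomial Naïve Bayes baseline by false positives, false negatives and log loss. The following is a minimal sketch of how such a pipeline might be assembled with scikit-learn. It is not the authors' implementation: the use of `SelectFromModel` to connect the two tree ensembles, every hyperparameter, the helper names `build_hybrid`, `build_baseline` and `evaluate`, and the toy Swahili messages are all assumptions made for illustration, and the scores it prints say nothing about the results reported in the record.

```python
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import confusion_matrix, log_loss
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy messages written for this sketch (NOT from the paper's dataset).
# Label 1 = smishing, label 0 = legitimate.
train_texts = [
    "Tuma pesa kwenye namba hii sasa hivi",
    "Ile pesa tuma kwenye namba hii jina litakuja Juma",
    "Umeshinda zawadi tuma pesa ya usajili kwenye namba hii",
    "Nimekosea kutuma pesa tafadhali nirudishie kwenye namba hii",
    "Habari za asubuhi tutaonana kesho",
    "Mkutano umeahirishwa hadi wiki ijayo",
    "Asante kwa chakula cha jioni",
    "Nitafika nyumbani saa mbili usiku",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["Tafadhali tuma pesa kwenye namba hii", "Tutaonana kesho asubuhi"]
test_labels = [1, 0]


def build_hybrid(seed=0):
    """TF-IDF -> Extra-Trees feature selection -> Random Forest (assumed wiring)."""
    selector = SelectFromModel(
        ExtraTreesClassifier(n_estimators=50, random_state=seed)  # assumed settings
    )
    return Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("select", selector),  # keeps terms with at least mean importance
        ("forest", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])


def build_baseline():
    """TF-IDF -> Multinomial Naive Bayes, the baseline family named in the summary."""
    return Pipeline([("tfidf", TfidfVectorizer()), ("nb", MultinomialNB())])


def evaluate(model, x_train, y_train, x_test, y_test):
    """Fit the model, then report false positives, false negatives and log loss."""
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    probabilities = model.predict_proba(x_test)
    tn, fp, fn, tp = confusion_matrix(y_test, predictions, labels=[0, 1]).ravel()
    return {
        "false_positives": int(fp),
        "false_negatives": int(fn),
        "log_loss": float(log_loss(y_test, probabilities, labels=[0, 1])),
    }


hybrid_report = evaluate(build_hybrid(), train_texts, train_labels, test_texts, test_labels)
baseline_report = evaluate(build_baseline(), train_texts, train_labels, test_texts, test_labels)
print("hybrid:", hybrid_report)
print("baseline:", baseline_report)
```

With a real corpus, the two toy lists would be replaced by a held-out split of labelled messages; wrapping the selector and classifier in one `Pipeline` keeps feature selection fitted on training data only.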