ACTSSD: social spammer detection based on active learning and co-training

Bibliographic Details
Published in:	The Journal of Supercomputing 2022-02, Vol.78 (2), p.2744-2771
Main Authors:	Chen, Ailin; Yang, Pin; Cheng, Pengsen
Format:	Article
Language:	English
Subjects:	Active learning; Algorithms; Compilers; Computer Science; Datasets; Interpreters; Labels; Machine learning; Processor Architectures; Programming Languages; Social networks

Description
Summary:	Social spammers spread rumors, advertisements and malicious links in social networks, which disrupts users’ normal access to those networks and causes security problems. Most methods detect social spammers using features such as content features, behavior features and relationship graph features, and therefore rely on large-scale labeled data. However, labeled training data are scarce in the real world, and manual annotation is time-consuming and labor-intensive. To solve this problem, we propose a novel method that combines an active learning algorithm with a co-training algorithm to make full use of unlabeled data. In co-training, user features are divided into two non-overlapping views, and classifiers are trained iteratively on labeled instances together with the most confident unlabeled instances, which receive pseudo-labels. In active learning, the most representative and uncertain instances are selected and annotated with real labels to extend the labeled dataset. Experimental results on the Twitter and Apontador datasets show that our method can effectively detect social spammers when labeled data are limited.
ISSN:	0920-8542 1573-0484
DOI:	10.1007/s11227-021-03966-3
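
Illustrative sketch (not the authors' implementation): the loop described in the summary, with two feature views, pseudo-labels for confident instances and oracle labels for uncertain ones, might look roughly like the Python below. Everything named here is an assumption made for illustration: the nearest-centroid classifier `CentroidClassifier`, the function `co_train_active`, the parameters `cut`, `rounds`, `k_pseudo` and `k_query`, and the synthetic data. The paper's selection rule also considers how representative an instance is; this sketch uses uncertainty only.

```python
import math
import random


class CentroidClassifier:
    """Hypothetical stand-in classifier: scores by distance to class centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def proba_spam(self, x):
        # Closer to the spam centroid (label 1) gives a score nearer to 1.
        d0 = math.dist(x, self.centroids[0])
        d1 = math.dist(x, self.centroids[1])
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5


def co_train_active(X, seed_labels, oracle, cut, rounds=5, k_pseudo=2, k_query=1):
    """Return a dict index -> label after combined co-training and active learning.

    seed_labels must contain at least one instance of each class (0 and 1).
    oracle(i) stands in for a human annotator returning the real label.
    cut splits each feature vector into two non-overlapping views.
    """
    labeled = dict(seed_labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(X)) if i not in labeled]
        if not unlabeled:
            break
        idx = list(labeled)
        y = [labeled[i] for i in idx]
        # One classifier per view, trained on the current labeled set.
        clf_a = CentroidClassifier().fit([X[i][:cut] for i in idx], y)
        clf_b = CentroidClassifier().fit([X[i][cut:] for i in idx], y)
        scores = {
            i: (clf_a.proba_spam(X[i][:cut]), clf_b.proba_spam(X[i][cut:]))
            for i in unlabeled
        }
        # Active learning step: ask the oracle about the most uncertain instances.
        by_uncertainty = sorted(unlabeled, key=lambda i: abs(sum(scores[i]) / 2 - 0.5))
        for i in by_uncertainty[:k_query]:
            labeled[i] = oracle(i)
        # Co-training step: pseudo-label the instances a view is most sure about.
        remaining = [i for i in unlabeled if i not in labeled]
        by_confidence = sorted(
            remaining,
            key=lambda i: max(abs(p - 0.5) for p in scores[i]),
            reverse=True,
        )
        for i in by_confidence[:k_pseudo]:
            most_confident = max(scores[i], key=lambda p: abs(p - 0.5))
            labeled[i] = int(most_confident >= 0.5)
    return labeled


# Synthetic demo data (assumed): two well-separated clusters, four features each.
rng = random.Random(0)
truth = [i % 2 for i in range(20)]
X = [[10 * t + rng.uniform(-1, 1) for _ in range(4)] for t in truth]
result = co_train_active(X, {0: 0, 1: 1}, oracle=lambda i: truth[i], cut=2)
print(len(result), "instances labeled")
```

In this toy setup only two instances start with labels; each round spends one oracle query and adds two pseudo-labels, which mirrors the summary's goal of extending the labeled set while keeping manual annotation small.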