Learning Discriminative Text Representation for Streaming Social Event Detection

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2023-12, Vol. 35 (12), p. 12295-12309
Main Authors: Tong, Chaodong; Peng, Huailiang; Bai, Xu; Dai, Qiong; Zhang, Ruitong; Li, Yangyang; Xu, Hanjie; Gu, Xian-Ming
Format: Article
Language: English
Description
Summary: Event detection on social platforms can help people perceive essential events and make actionable decisions. Existing document-pivot streaming social event detection methods generally embed documents and perform text clustering. They face the challenges of constantly changing context and unknown event categories, and respond by designing compound text representation methods and various similarity measures. However, such phased, carefully designed methods are fragile and cannot fully utilize the potential of text representations. Meanwhile, their complex threshold settings make clustering-based event detection hard to adapt to ever-changing environments. We propose a text representation learning method, the Text Similarity Contrastive Learning Neural Network (Text-SimCLNN), to tackle these challenges. Text-SimCLNN uses smaller components to learn the similarity probability of text pairs from semantic and structural perspectives, effectively bridging the gap between text representation learning and similarity measurement in streaming event detection. Event discovery and merging in streams can be easily performed based on the learned representations, and we use various techniques to speed up these processes. Furthermore, we introduce an online update mechanism that uses heterogeneous graphs to generate high-quality samples to enable stable and reliable inductive learning. Extensive experiments on two real-world datasets demonstrate that our method substantially outperforms state-of-the-art (SOTA) methods.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2021.3119686
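
Illustrative note: the summary describes document-pivot streaming detection as embedding each document and clustering it against known events. The sketch below is one minimal way to picture that generic baseline, not the Text-SimCLNN method itself. It assumes documents have already been embedded as plain numeric vectors, and it uses cosine similarity with a fixed threshold in place of a learned similarity probability. The class name `StreamingEventDetector`, the helper `cosine`, and the threshold value 0.8 are hypothetical choices made for illustration only.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class StreamingEventDetector:
    """Hypothetical single-pass, threshold-based event clustering.

    Each event is summarized by the running mean (centroid) of the
    document vectors assigned to it so far.
    """

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed, hand-set similarity cutoff
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def add(self, vec: Sequence[float]) -> int:
        """Assign one document vector to an event and return the event id."""
        best_id, best_sim = -1, -1.0
        for event_id, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_id, best_sim = event_id, sim

        if best_id >= 0 and best_sim >= self.threshold:
            # Similar enough: merge into the existing event and
            # update its centroid as a running mean.
            n = self.counts[best_id]
            self.centroids[best_id] = [
                (c * n + x) / (n + 1)
                for c, x in zip(self.centroids[best_id], vec)
            ]
            self.counts[best_id] = n + 1
            return best_id

        # No event is similar enough: treat the document as a new event.
        self.centroids.append(list(vec))
        self.counts.append(1)
        return len(self.centroids) - 1
```

Calling `add` once per incoming document vector yields an event id for each document in arrival order. The hand-set `threshold` is the kind of setting the summary says is hard to keep valid as the stream changes, which is the motivation it gives for learning the similarity of text pairs instead.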