Learning Discriminative Text Representation for Streaming Social Event Detection
Published in: | IEEE transactions on knowledge and data engineering 2023-12, Vol.35 (12), p.12295-12309 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Summary: | Event detection on social platforms can help people perceive essential events and make actionable decisions. Existing document-pivot streaming social event detection methods generally embed documents and perform text clustering. They face the challenges of constantly changing context and unknown event categories, and respond by designing compound text representation methods and various similarity measures. However, these phased, carefully designed methods are fragile and cannot fully exploit the potential of text representations. Meanwhile, their complex threshold settings make clustering-based event detection suffer in ever-changing environments. We propose a text representation learning method, namely Text Similarity Contrastive Learning Neural Network (Text-SimCLNN), to tackle these challenges. Text-SimCLNN uses smaller parts to learn the similarity probability of text pairs from semantic and structural perspectives, effectively bridging the gap between text representation learning and similarity measurement in streaming event detection. Event discovery and merging in streams can be easily performed based on the learned representations, and we use various techniques to speed up these processes. Furthermore, we introduce an online update mechanism that uses heterogeneous graphs to generate high-quality samples, enabling stable and reliable inductive learning. Extensive experiments on two real-world datasets demonstrate that our method far exceeds state-of-the-art (SOTA) methods. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2021.3119686 |