PGeoTopic: A Distributed Solution for Mining Geographical Topic Models

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2022-02, Vol. 34 (2), p. 881-893
Main Authors: Zhao, Kaiqi; Cong, Gao; Li, Xiucheng
Format: Article
Language: English
Description
Summary: Geographical topic models have been used to mine geo-tagged documents for topical regions and geographical topics, and they also have applications in recommendation, user mobility modeling, event detection, etc. Existing studies focus on learning effective geographical topic models while ignoring the efficiency issue. However, training geographical topic models is very expensive: it may take days to train even a small-scale model on a collection of documents with millions of word tokens. In this paper, we propose the first distributed solution, called PGeoTopic, for training geographical topic models. The proposed solution comprises several novel technical components to increase parallelism, reduce memory requirements, and reduce communication cost. Experiments show that our approach for mining geographical topic models scales with both model size and data size on distributed systems.
ISSN: 1041-4347; 1558-2191
DOI: 10.1109/TKDE.2020.2989142