Scalable decoupling graph neural network with feature-oriented optimization

Bibliographic Details
Published in: The VLDB Journal 2024-05, Vol. 33 (3), p. 667-683
Main Authors: Liao, Ningyi; Mo, Dingheng; Luo, Siqiang; Li, Xiang; Yin, Pengcheng
Format: Article
Language: English
Description
Summary: Recent advances in data processing have stimulated demand for learning on very large graphs. Graph neural networks (GNNs), an emerging and powerful approach to graph learning tasks, are known to be difficult to scale up. Most scalable models apply node-based techniques to simplify the expensive message-passing propagation procedure of GNNs. However, we find such acceleration insufficient when applied to million- or even billion-scale graphs. In this work, we propose SCARA, a scalable GNN with feature-oriented optimization for graph computation. SCARA efficiently computes graph embeddings along the dimension of node features, and further selects and reuses feature computation results to reduce overhead. Theoretical analysis indicates that our model achieves sub-linear time complexity with guaranteed precision in the propagation process as well as in GNN training and inference. We conduct extensive experiments on various datasets to evaluate the efficacy and efficiency of SCARA. Performance comparison with baselines shows that SCARA can reach up to 800× graph propagation acceleration over current state-of-the-art methods, with fast convergence and comparable accuracy. Most notably, it completes precomputation on the largest available billion-scale GNN dataset, Papers100M (111M nodes, 1.6B edges), in 13 s.
ISSN: 1066-8888, 0949-877X
DOI: 10.1007/s00778-023-00829-6