Characterizing and Understanding HGNN Training on GPUs

Bibliographic Details
Published in: ACM Transactions on Architecture and Code Optimization, 2024-11
Main Authors: Han, Dengke; Yan, Mingyu; Ye, Xiaochun; Fan, Dongrui
Format: Article
Language: English
Description
Summary: Owing to their remarkable representation capabilities for heterogeneous graph data, Heterogeneous Graph Neural Networks (HGNNs) have been widely adopted in critical real-world domains such as recommendation systems and medical analysis. Before an HGNN can be deployed, identifying the model parameters best suited to a specific task requires extensive training, a time-consuming and costly process. To make HGNN training more efficient, it is essential to characterize and analyze the execution semantics and patterns of the training process and thereby identify its performance bottlenecks. In this study, we conduct a comprehensive quantification and in-depth analysis of the two mainstream HGNN training scenarios: single-GPU training and multi-GPU distributed training. Based on the characterization results, we reveal the performance bottlenecks and their underlying causes in each scenario and propose optimization guidelines from both software and hardware perspectives.
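
The stage-level characterization the summary describes, breaking a GPU training iteration into phases and measuring where time goes, can be illustrated with a short sketch. The example below is not the authors' tooling: it is a minimal stand-in that uses PyTorch's torch.profiler, with a toy per-relation projection layer (ToyHeteroLayer, a hypothetical name) in place of a real HGNN, to show how one might collect per-stage timings for the forward pass, backward pass, and optimizer step on a single GPU.

    # Minimal sketch: stage-level profiling of one training loop with
    # torch.profiler. The model is a toy stand-in, not the paper's HGNN.
    import torch
    import torch.nn as nn
    from torch.profiler import profile, record_function, ProfilerActivity

    device = "cuda" if torch.cuda.is_available() else "cpu"

    class ToyHeteroLayer(nn.Module):
        # Hypothetical stand-in for one HGNN layer: a projection per
        # relation type, fused by a simple mean. A real study would use
        # heterogeneous graph operators (e.g., from DGL or PyG).
        def __init__(self, in_dim, out_dim, num_relations):
            super().__init__()
            self.proj = nn.ModuleList(
                [nn.Linear(in_dim, out_dim) for _ in range(num_relations)]
            )

        def forward(self, x):
            return torch.stack([p(x) for p in self.proj]).mean(dim=0)

    model = ToyHeteroLayer(128, 64, num_relations=4).to(device)
    opt = torch.optim.Adam(model.parameters())
    x = torch.randn(4096, 128, device=device)
    y = torch.randn(4096, 64, device=device)

    activities = [ProfilerActivity.CPU]
    if device == "cuda":
        activities.append(ProfilerActivity.CUDA)

    with profile(activities=activities) as prof:
        for _ in range(10):
            with record_function("forward"):
                loss = nn.functional.mse_loss(model(x), y)
            with record_function("backward"):
                opt.zero_grad()
                loss.backward()
            with record_function("optimizer_step"):
                opt.step()

    # Per-stage breakdown; CUDA kernel times appear in the same table
    # when a GPU is present.
    print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=15))

The record_function markers are what make the breakdown stage-level rather than kernel-level: aggregating kernels under forward, backward, and optimizer labels is one plausible way to surface the kind of per-phase bottlenecks the study analyzes.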
ISSN: 1544-3566, 1544-3973
DOI: 10.1145/3703356