GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism...

by Seung-hwan Lim
Publication Type
Conference Paper
Book Title
Proceedings of The Eighth Annual Conference on Machine Learning and Systems
Publication Date
Page Number
273
Conference Name
The Eighth Annual Conference on Machine Learning and Systems
Conference Location
Santa Clara, California, United States of America
Conference Sponsor
Systems and Machine Learning Foundation
Conference Date

Graph neural networks (GNNs), an emerging class of machine learning models for graphs, have gained popularity for their superior performance on various graph analytical tasks. Mini-batch training is commonly used to train GNNs on large graphs, and data parallelism is the standard approach to scale mini-batch training across multiple GPUs. Data-parallel approaches, however, perform redundant work because the subgraphs sampled by different GPUs overlap significantly. To address this issue, we introduce a hybrid parallel mini-batch training paradigm called Split parallelism. Split parallelism avoids redundant work by splitting the sampling, loading, and training of each mini-batch across multiple GPUs. Split parallelism, however, introduces communication overheads that can exceed the savings from eliminating redundant work. We further present a lightweight partitioning algorithm that probabilistically minimizes these overheads. We implement split parallelism in GSplit and show that it outperforms state-of-the-art mini-batch training systems such as DGL, Quiver, and P3.
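
The split-parallel idea in the abstract can be illustrated with a small sketch: rather than every GPU sampling and loading its own, heavily overlapping subgraph, a single sampled mini-batch is split by vertex ownership so that each vertex's features are loaded and aggregated on exactly one GPU, at the cost of communicating embeddings along edges whose endpoints are owned by different GPUs. The Python sketch below is a toy illustration under those assumptions; the names (probabilistic_partition, split_minibatch) and the uniformly random partitioner are stand-ins, not GSplit's actual API, and the real lightweight partitioning algorithm is designed to probabilistically minimize the resulting communication overheads rather than ignore graph structure as this one does.

import random
from collections import defaultdict

NUM_GPUS = 4

def probabilistic_partition(num_vertices, num_gpus, seed=0):
    """Assign each vertex to a GPU uniformly at random.

    Stand-in for a lightweight partitioning step; the paper's algorithm
    additionally aims to minimize expected cross-GPU communication.
    """
    rng = random.Random(seed)
    return {v: rng.randrange(num_gpus) for v in range(num_vertices)}

def split_minibatch(batch_vertices, batch_edges, owner):
    """Split one sampled mini-batch subgraph across GPUs by vertex ownership."""
    local_vertices = defaultdict(set)   # vertices whose features each GPU loads
    local_edges = defaultdict(list)     # edges aggregated on the destination's owner
    cross_gpu_edges = 0                 # edges whose endpoints live on different GPUs
    for v in batch_vertices:
        local_vertices[owner[v]].add(v)
    for (src, dst) in batch_edges:
        local_edges[owner[dst]].append((src, dst))
        if owner[src] != owner[dst]:
            cross_gpu_edges += 1        # source embedding must be communicated
    return local_vertices, local_edges, cross_gpu_edges

if __name__ == "__main__":
    # A small synthetic "sampled mini-batch": 1000 vertices, 5000 edges.
    rng = random.Random(42)
    vertices = list(range(1000))
    edges = [(rng.randrange(1000), rng.randrange(1000)) for _ in range(5000)]

    owner = probabilistic_partition(len(vertices), NUM_GPUS)
    local_v, local_e, crossing = split_minibatch(vertices, edges, owner)

    for gpu in range(NUM_GPUS):
        print(f"GPU {gpu}: {len(local_v[gpu])} owned vertices, "
              f"{len(local_e[gpu])} edges to aggregate")
    print(f"Cross-GPU edges requiring communication: {crossing} / {len(edges)}")

Running the sketch prints, for each simulated GPU, how many vertex features it would load and how many edges it would aggregate, along with the count of cross-GPU edges that would require communication, which is the overhead the abstract's partitioning algorithm seeks to keep below the savings from removing redundant work.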