Loading…

Distributed edge partitioning for trillion-edge graphs

We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructe...

Full description

Saved in:
Bibliographic Details
Published in:Proceedings of the VLDB Endowment 2019-09, Vol.12 (13), p.2379-2392
Main Authors: Hanai, Masatoshi, Suzumura, Toyotaro, Tan, Wen Jun, Liu, Elvis, Theodoropoulos, Georgios, Cai, Wentong
Format: Article
Language:English
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by cdi_FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3
cites cdi_FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3
container_end_page 2392
container_issue 13
container_start_page 2379
container_title Proceedings of the VLDB Endowment
container_volume 12
creator Hanai, Masatoshi
Suzumura, Toyotaro
Tan, Wen Jun
Liu, Elvis
Theodoropoulos, Georgios
Cai, Wentong
description We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes local minimal. We theoretically prove that the proposed method has the upper bound in the partitioning quality. The empirical evaluation with various graphs shows that the proposed method produces higher-quality partitions than the state-of-the-art distributed graph partitioning algorithms. The performance evaluation shows that the space efficiency of the proposed method is an order-of-magnitude better than the existing algorithms, keeping its time efficiency comparable. As a result, Distributed NE can partition a trillion-edge graph using only 256 machines within 70 minutes.
doi_str_mv 10.14778/3358701.3358706
format article
fullrecord <record><control><sourceid>crossref</sourceid><recordid>TN_cdi_crossref_primary_10_14778_3358701_3358706</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>10_14778_3358701_3358706</sourcerecordid><originalsourceid>FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3</originalsourceid><addsrcrecordid>eNpNj01LxDAURYMoOI7uXfYPZHxJ2veSpYyfMOBG1yVNkxqp05LEhf_eYaYLV-deDly4jN0K2IiaSN8p1WgCsTkRz9hKiga4BkPn__Ilu8r5CwA1Cr1i-BBzSbH7Kb6vfD_4arapxBKnfdwPVZhSddDjeOj8qIdk5898zS6CHbO_WbhmH0-P79sXvnt7ft3e77gTpApHHSRKJNW40BkgK7UHL30Q0oIxlrAOWgbbKdBIDZKhWjq0da9dQO_UmsFp16Up5-RDO6f4bdNvK6A9_m6X3wtR_QFlfUp0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Distributed edge partitioning for trillion-edge graphs</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Hanai, Masatoshi ; Suzumura, Toyotaro ; Tan, Wen Jun ; Liu, Elvis ; Theodoropoulos, Georgios ; Cai, Wentong</creator><creatorcontrib>Hanai, Masatoshi ; Suzumura, Toyotaro ; Tan, Wen Jun ; Liu, Elvis ; Theodoropoulos, Georgios ; Cai, Wentong</creatorcontrib><description>We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes local minimal. We theoretically prove that the proposed method has the upper bound in the partitioning quality. The empirical evaluation with various graphs shows that the proposed method produces higher-quality partitions than the state-of-the-art distributed graph partitioning algorithms. The performance evaluation shows that the space efficiency of the proposed method is an order-of-magnitude better than the existing algorithms, keeping its time efficiency comparable. As a result, Distributed NE can partition a trillion-edge graph using only 256 machines within 70 minutes.</description><identifier>ISSN: 2150-8097</identifier><identifier>EISSN: 2150-8097</identifier><identifier>DOI: 10.14778/3358701.3358706</identifier><language>eng</language><ispartof>Proceedings of the VLDB Endowment, 2019-09, Vol.12 (13), p.2379-2392</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3</citedby><cites>FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27922,27923</link.rule.ids></links><search><creatorcontrib>Hanai, Masatoshi</creatorcontrib><creatorcontrib>Suzumura, Toyotaro</creatorcontrib><creatorcontrib>Tan, Wen Jun</creatorcontrib><creatorcontrib>Liu, Elvis</creatorcontrib><creatorcontrib>Theodoropoulos, Georgios</creatorcontrib><creatorcontrib>Cai, Wentong</creatorcontrib><title>Distributed edge partitioning for trillion-edge graphs</title><title>Proceedings of the VLDB Endowment</title><description>We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes local minimal. We theoretically prove that the proposed method has the upper bound in the partitioning quality. The empirical evaluation with various graphs shows that the proposed method produces higher-quality partitions than the state-of-the-art distributed graph partitioning algorithms. The performance evaluation shows that the space efficiency of the proposed method is an order-of-magnitude better than the existing algorithms, keeping its time efficiency comparable. As a result, Distributed NE can partition a trillion-edge graph using only 256 machines within 70 minutes.</description><issn>2150-8097</issn><issn>2150-8097</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNpNj01LxDAURYMoOI7uXfYPZHxJ2veSpYyfMOBG1yVNkxqp05LEhf_eYaYLV-deDly4jN0K2IiaSN8p1WgCsTkRz9hKiga4BkPn__Ilu8r5CwA1Cr1i-BBzSbH7Kb6vfD_4arapxBKnfdwPVZhSddDjeOj8qIdk5898zS6CHbO_WbhmH0-P79sXvnt7ft3e77gTpApHHSRKJNW40BkgK7UHL30Q0oIxlrAOWgbbKdBIDZKhWjq0da9dQO_UmsFp16Up5-RDO6f4bdNvK6A9_m6X3wtR_QFlfUp0</recordid><startdate>201909</startdate><enddate>201909</enddate><creator>Hanai, Masatoshi</creator><creator>Suzumura, Toyotaro</creator><creator>Tan, Wen Jun</creator><creator>Liu, Elvis</creator><creator>Theodoropoulos, Georgios</creator><creator>Cai, Wentong</creator><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>201909</creationdate><title>Distributed edge partitioning for trillion-edge graphs</title><author>Hanai, Masatoshi ; Suzumura, Toyotaro ; Tan, Wen Jun ; Liu, Elvis ; Theodoropoulos, Georgios ; Cai, Wentong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hanai, Masatoshi</creatorcontrib><creatorcontrib>Suzumura, Toyotaro</creatorcontrib><creatorcontrib>Tan, Wen Jun</creatorcontrib><creatorcontrib>Liu, Elvis</creatorcontrib><creatorcontrib>Theodoropoulos, Georgios</creatorcontrib><creatorcontrib>Cai, Wentong</creatorcontrib><collection>CrossRef</collection><jtitle>Proceedings of the VLDB Endowment</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hanai, Masatoshi</au><au>Suzumura, Toyotaro</au><au>Tan, Wen Jun</au><au>Liu, Elvis</au><au>Theodoropoulos, Georgios</au><au>Cai, Wentong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Distributed edge partitioning for trillion-edge graphs</atitle><jtitle>Proceedings of the VLDB Endowment</jtitle><date>2019-09</date><risdate>2019</risdate><volume>12</volume><issue>13</issue><spage>2379</spage><epage>2392</epage><pages>2379-2392</pages><issn>2150-8097</issn><eissn>2150-8097</eissn><abstract>We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes local minimal. We theoretically prove that the proposed method has the upper bound in the partitioning quality. The empirical evaluation with various graphs shows that the proposed method produces higher-quality partitions than the state-of-the-art distributed graph partitioning algorithms. The performance evaluation shows that the space efficiency of the proposed method is an order-of-magnitude better than the existing algorithms, keeping its time efficiency comparable. As a result, Distributed NE can partition a trillion-edge graph using only 256 machines within 70 minutes.</abstract><doi>10.14778/3358701.3358706</doi><tpages>14</tpages></addata></record>
fulltext fulltext
identifier ISSN: 2150-8097
ispartof Proceedings of the VLDB Endowment, 2019-09, Vol.12 (13), p.2379-2392
issn 2150-8097
2150-8097
language eng
recordid cdi_crossref_primary_10_14778_3358701_3358706
source Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
title Distributed edge partitioning for trillion-edge graphs
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T16%3A40%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Distributed%20edge%20partitioning%20for%20trillion-edge%20graphs&rft.jtitle=Proceedings%20of%20the%20VLDB%20Endowment&rft.au=Hanai,%20Masatoshi&rft.date=2019-09&rft.volume=12&rft.issue=13&rft.spage=2379&rft.epage=2392&rft.pages=2379-2392&rft.issn=2150-8097&rft.eissn=2150-8097&rft_id=info:doi/10.14778/3358701.3358706&rft_dat=%3Ccrossref%3E10_14778_3358701_3358706%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c173t-68f2626735cfb907a28e0e2ef12a099a764f82fab308675679742c6a4d8cf6ec3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true