A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters
We present a hybrid CUDA-MPI sorting algorithm that makes use of GPU clusters to sort large data sets. Our algorithm has two phases. In the first phase each node sorts a portion of the data on its GPU using a parallel bitonic sort. In the second phase the sorted subsequences are merged together in p...
Main Authors: | White, S. ; Verosky, N. ; Newhall, T. |
---|---|
Format: | Conference Proceeding |
Language: | English |
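Subjects: | Algorithm design and analysis ; Clustering algorithms ; GPU clusters ; Graphics processing unit ; hybrid CUDA-MPI ; Parallel processing ; parallel sorting algorithm ; Random access memory ; Runtime ; Sorting |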
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 589 |
container_issue | |
container_start_page | 588 |
container_title | |
container_volume | |
creator | White, S. ; Verosky, N. ; Newhall, T. |
description | We present a hybrid CUDA-MPI sorting algorithm that makes use of GPU clusters to sort large data sets. Our algorithm has two phases. In the first phase each node sorts a portion of the data on its GPU using a parallel bitonic sort. In the second phase the sorted subsequences are merged together in parallel using a reduction sorting network implemented in MPI across the cluster nodes. Performance results comparing our sorting algorithm to sequential quick sort yield speed-up values of up to 9.8 for sorting 4GB of data on a 32 node GPU cluster. We anticipate even better speed-up values using our algorithm on larger data sets and larger sized clusters. |
doi_str_mv | 10.1109/ICPPW.2012.82 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6337530</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6337530</ieee_id><sourcerecordid>6337530</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-e68c76b8f8d17297a6334d0540011f5961d69bec68694bed91719d8ced6b85973</originalsourceid><addsrcrecordid>eNotjEtLw0AUhccXWGqXrtzMH0i9dybzuOAmptoGKgZscVmazKSOtI1M4qL_3qAeOJzFx_kYu0WYIgLdF3lZvk8FoJhaccYmZCwYTSo1Q8_ZSEgpEqUJLn4ZptpIoYDMJRsBEiSS0F6zSdd9whArUFocsYeM5-tZlryUBV-cqhgcfwx9eww1f2tjH447nu13bQz9x4E3beTzcs3z_XfX-9jdsKtmu-_85H_HbPX8tMoXyfJ1XuTZMgkEfeK1rY2ubGMdGkFmq6VMHagUALFRpNFpqnytraa08o7QIDlbezecFBk5Znd_2uC933zFcNjG02awGCVB_gDLvEnm</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>White, S. ; Verosky, N. ; Newhall, T.</creator><creatorcontrib>White, S. ; Verosky, N. ; Newhall, T.</creatorcontrib><description>We present a hybrid CUDA-MPI sorting algorithm that makes use of GPU clusters to sort large data sets. Our algorithm has two phases. In the first phase each node sorts a portion of the data on its GPU using a parallel bitonic sort. In the second phase the sorted subsequences are merged together in parallel using a reduction sorting network implemented in MPI across the cluster nodes. Performance results comparing our sorting algorithm to sequential quick sort yield speed-up values of up to 9.8 for sorting 4GB of data on a 32 node GPU cluster. 
We anticipate even better speed-up values using our algorithm on larger data sets and larger sized clusters.</description><identifier>ISSN: 0190-3918</identifier><identifier>ISBN: 9781467325097</identifier><identifier>ISBN: 1467325090</identifier><identifier>EISSN: 2332-5690</identifier><identifier>EISBN: 9780769547954</identifier><identifier>EISBN: 0769547958</identifier><identifier>DOI: 10.1109/ICPPW.2012.82</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Clustering algorithms ; GPU clusters ; Graphics processing unit ; hybrid CUDA-MPI ; Parallel processing ; parallel sorting algorithm ; Random access memory ; Runtime ; Sorting</subject><ispartof>2012 41st International Conference on Parallel Processing Workshops, 2012, p.588-589</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6337530$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27902,54895</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6337530$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>White, S.</creatorcontrib><creatorcontrib>Verosky, N.</creatorcontrib><creatorcontrib>Newhall, T.</creatorcontrib><title>A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters</title><title>2012 41st International Conference on Parallel Processing Workshops</title><addtitle>icppw</addtitle><description>We present a hybrid CUDA-MPI sorting algorithm that makes use of GPU clusters to sort large data sets. Our algorithm has two phases. In the first phase each node sorts a portion of the data on its GPU using a parallel bitonic sort. 
In the second phase the sorted subsequences are merged together in parallel using a reduction sorting network implemented in MPI across the cluster nodes. Performance results comparing our sorting algorithm to sequential quick sort yield speed-up values of up to 9.8 for sorting 4GB of data on a 32 node GPU cluster. We anticipate even better speed-up values using our algorithm on larger data sets and larger sized clusters.</description><subject>Algorithm design and analysis</subject><subject>Clustering algorithms</subject><subject>GPU clusters</subject><subject>Graphics processing unit</subject><subject>hybrid CUDA-MPI</subject><subject>Parallel processing</subject><subject>parallel sorting algorithm</subject><subject>Random access memory</subject><subject>Runtime</subject><subject>Sorting</subject><issn>0190-3918</issn><issn>2332-5690</issn><isbn>9781467325097</isbn><isbn>1467325090</isbn><isbn>9780769547954</isbn><isbn>0769547958</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjEtLw0AUhccXWGqXrtzMH0i9dybzuOAmptoGKgZscVmazKSOtI1M4qL_3qAeOJzFx_kYu0WYIgLdF3lZvk8FoJhaccYmZCwYTSo1Q8_ZSEgpEqUJLn4ZptpIoYDMJRsBEiSS0F6zSdd9whArUFocsYeM5-tZlryUBV-cqhgcfwx9eww1f2tjH447nu13bQz9x4E3beTzcs3z_XfX-9jdsKtmu-_85H_HbPX8tMoXyfJ1XuTZMgkEfeK1rY2ubGMdGkFmq6VMHagUALFRpNFpqnytraa08o7QIDlbezecFBk5Znd_2uC933zFcNjG02awGCVB_gDLvEnm</recordid><startdate>201209</startdate><enddate>201209</enddate><creator>White, S.</creator><creator>Verosky, N.</creator><creator>Newhall, T.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201209</creationdate><title>A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters</title><author>White, S. ; Verosky, N. 
; Newhall, T.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-e68c76b8f8d17297a6334d0540011f5961d69bec68694bed91719d8ced6b85973</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Algorithm design and analysis</topic><topic>Clustering algorithms</topic><topic>GPU clusters</topic><topic>Graphics processing unit</topic><topic>hybrid CUDA-MPI</topic><topic>Parallel processing</topic><topic>parallel sorting algorithm</topic><topic>Random access memory</topic><topic>Runtime</topic><topic>Sorting</topic><toplevel>online_resources</toplevel><creatorcontrib>White, S.</creatorcontrib><creatorcontrib>Verosky, N.</creatorcontrib><creatorcontrib>Newhall, T.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>White, S.</au><au>Verosky, N.</au><au>Newhall, T.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters</atitle><btitle>2012 41st International Conference on Parallel Processing Workshops</btitle><stitle>icppw</stitle><date>2012-09</date><risdate>2012</risdate><spage>588</spage><epage>589</epage><pages>588-589</pages><issn>0190-3918</issn><eissn>2332-5690</eissn><isbn>9781467325097</isbn><isbn>1467325090</isbn><eisbn>9780769547954</eisbn><eisbn>0769547958</eisbn><coden>IEEPAD</coden><abstract>We present a hybrid CUDA-MPI sorting algorithm that makes use of GPU clusters to sort large 
data sets. Our algorithm has two phases. In the first phase each node sorts a portion of the data on its GPU using a parallel bitonic sort. In the second phase the sorted subsequences are merged together in parallel using a reduction sorting network implemented in MPI across the cluster nodes. Performance results comparing our sorting algorithm to sequential quick sort yield speed-up values of up to 9.8 for sorting 4GB of data on a 32 node GPU cluster. We anticipate even better speed-up values using our algorithm on larger data sets and larger sized clusters.</abstract><pub>IEEE</pub><doi>10.1109/ICPPW.2012.82</doi><tpages>2</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0190-3918 |
ispartof | 2012 41st International Conference on Parallel Processing Workshops, 2012, p.588-589 |
issn | 0190-3918 ; 2332-5690 |
language | eng |
recordid | cdi_ieee_primary_6337530 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Algorithm design and analysis ; Clustering algorithms ; GPU clusters ; Graphics processing unit ; hybrid CUDA-MPI ; Parallel processing ; parallel sorting algorithm ; Random access memory ; Runtime ; Sorting |
title | A CUDA-MPI Hybrid Bitonic Sorting Algorithm for GPU Clusters |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T03%3A00%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20CUDA-MPI%20Hybrid%20Bitonic%20Sorting%20Algorithm%20for%20GPU%20Clusters&rft.btitle=2012%2041st%20International%20Conference%20on%20Parallel%20Processing%20Workshops&rft.au=White,%20S.&rft.date=2012-09&rft.spage=588&rft.epage=589&rft.pages=588-589&rft.issn=0190-3918&rft.eissn=2332-5690&rft.isbn=9781467325097&rft.isbn_list=1467325090&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ICPPW.2012.82&rft.eisbn=9780769547954&rft.eisbn_list=0769547958&rft_dat=%3Cieee_6IE%3E6337530%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i90t-e68c76b8f8d17297a6334d0540011f5961d69bec68694bed91719d8ced6b85973%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6337530&rfr_iscdi=true |
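
The two-phase scheme summarized in the description field can be illustrated with a small single-process sketch. This is not the authors' CUDA-MPI implementation: the function names `bitonic_sort` and `hybrid_sort` are hypothetical, each "node" is simulated as a Python list, and the pairwise tree merge in phase two is only an assumed stand-in for the paper's reduction sorting network, whose exact structure the record does not give. The sketch also assumes the node count and the per-node chunk size are powers of two.

```python
import heapq


def bitonic_sort(values):
    """Sort a power-of-two-length list with a bitonic sorting network.

    Stand-in for the per-node GPU sort of phase one; on a GPU the inner
    loop over i would run as parallel threads.
    """
    data = list(values)
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # compare-exchange distance within a merge
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    out_of_order = (
                        data[i] > data[partner]
                        if ascending
                        else data[i] < data[partner]
                    )
                    if out_of_order:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data


def hybrid_sort(values, num_nodes):
    """Simulate the two phases on num_nodes hypothetical cluster nodes."""
    assert num_nodes > 0 and num_nodes & (num_nodes - 1) == 0
    assert len(values) % num_nodes == 0
    chunk = len(values) // num_nodes

    # Phase 1: every node sorts its own portion independently.
    runs = [
        bitonic_sort(values[r * chunk:(r + 1) * chunk])
        for r in range(num_nodes)
    ]

    # Phase 2 (assumed structure): pairwise tree reduction. In each round,
    # node r absorbs the sorted run held by node r + step; with MPI this
    # would be a receive on r, a send on r + step, then a local merge.
    step = 1
    while step < num_nodes:
        for r in range(0, num_nodes, 2 * step):
            runs[r] = list(heapq.merge(runs[r], runs[r + step]))
            runs[r + step] = []
        step *= 2
    return runs[0]
```

Calling `hybrid_sort(data, 4)` on a 16-element list sorts four 4-element chunks independently and then merges them in two rounds. The tree shape means only half of the remaining nodes stay active after each round, which is one reason a real implementation might prefer a different merge network.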