Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures

Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has been shown to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore architectures, we face new challenges that c...

Bibliographic Details
Main Authors: Baker, A. H.; Gamblin, T.; Schulz, M.; Yang, U. M.
Format: Conference Proceeding
Language: English
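Subjects: Deformable models; Interpolation; Laboratories; Multicore processing; Program processors; Three dimensional displays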
cited_by
cites
container_end_page 286
container_issue
container_start_page 275
container_title
container_volume
creator Baker, A. H.
Gamblin, T.
Schulz, M.
Yang, U. M.
description Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has been shown to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore architectures, we face new challenges that can significantly degrade AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.
doi_str_mv 10.1109/IPDPS.2011.35
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_6012844</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6012844</ieee_id><sourcerecordid>6012844</sourcerecordid><originalsourceid>FETCH-LOGICAL-i1955-83bff91389fd84deaaa8f065abc0f4d17d3540ac5e47832da9937a3330b802f03</originalsourceid><addsrcrecordid>eNotj81KAzEYACMqWGuPnrzkBXb9ki_ZJMelWi20tFA9l2x-toG1K0l78O1F6mlgDgNDyCODmjEwz8vty3ZXc2CsRnlFZkZpUI2RArVU1-SeNYxrgYqrGzJhEqHioOQdmZWSOuCNapTQckI284MdhnDsQ6FjpDtnh3TsaTv0ocs2Obo-D6fU5-Rp6_JYCl2PPuTjxbsxB9pmd0in4E7nHMoDuY12KGH2zyn5XLx-zN-r1eZtOW9XVWJGykpjF6NhqE30WvhgrdURGmk7B1F4pjxKAdbJIJRG7q0xqCwiQqeBR8Apebp0Uwhh_53Tl80_-wb-rgX-AqQEUT8</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</title><source>IEEE Xplore All Conference Series</source><creator>Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. M.</creator><creatorcontrib>Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. M.</creatorcontrib><description>Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has shown to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. 
We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.</description><identifier>ISSN: 1530-2075</identifier><identifier>ISBN: 1612843727</identifier><identifier>ISBN: 9781612843728</identifier><identifier>EISBN: 9780769543857</identifier><identifier>EISBN: 0769543855</identifier><identifier>DOI: 10.1109/IPDPS.2011.35</identifier><language>eng</language><publisher>IEEE</publisher><subject>Deformable models ; Interpolation ; Laboratories ; Multicore processing ; Program processors ; Three dimensional displays</subject><ispartof>2011 IEEE International Parallel &amp; Distributed Processing Symposium, , p.275-286</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6012844$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2057,27924,54554,54919,54931</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6012844$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Baker, A. H.</creatorcontrib><creatorcontrib>Gamblin, T.</creatorcontrib><creatorcontrib>Schulz, M.</creatorcontrib><creatorcontrib>Yang, U. M.</creatorcontrib><title>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</title><title>2011 IEEE International Parallel &amp; Distributed Processing Symposium</title><addtitle>ipdps</addtitle><description>Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has shown to be extremely efficient on distributed-memory architectures. 
However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.</description><subject>Deformable models</subject><subject>Interpolation</subject><subject>Laboratories</subject><subject>Multicore processing</subject><subject>Program processors</subject><subject>Three dimensional displays</subject><issn>1530-2075</issn><isbn>1612843727</isbn><isbn>9781612843728</isbn><isbn>9780769543857</isbn><isbn>0769543855</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotj81KAzEYACMqWGuPnrzkBXb9ki_ZJMelWi20tFA9l2x-toG1K0l78O1F6mlgDgNDyCODmjEwz8vty3ZXc2CsRnlFZkZpUI2RArVU1-SeNYxrgYqrGzJhEqHioOQdmZWSOuCNapTQckI284MdhnDsQ6FjpDtnh3TsaTv0ocs2Obo-D6fU5-Rp6_JYCl2PPuTjxbsxB9pmd0in4E7nHMoDuY12KGH2zyn5XLx-zN-r1eZtOW9XVWJGykpjF6NhqE30WvhgrdURGmk7B1F4pjxKAdbJIJRG7q0xqCwiQqeBR8Apebp0Uwhh_53Tl80_-wb-rgX-AqQEUT8</recordid><creator>Baker, A. H.</creator><creator>Gamblin, T.</creator><creator>Schulz, M.</creator><creator>Yang, U. M.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><title>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</title><author>Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. 
M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i1955-83bff91389fd84deaaa8f065abc0f4d17d3540ac5e47832da9937a3330b802f03</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><topic>Deformable models</topic><topic>Interpolation</topic><topic>Laboratories</topic><topic>Multicore processing</topic><topic>Program processors</topic><topic>Three dimensional displays</topic><toplevel>online_resources</toplevel><creatorcontrib>Baker, A. H.</creatorcontrib><creatorcontrib>Gamblin, T.</creatorcontrib><creatorcontrib>Schulz, M.</creatorcontrib><creatorcontrib>Yang, U. M.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Baker, A. H.</au><au>Gamblin, T.</au><au>Schulz, M.</au><au>Yang, U. M.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</atitle><btitle>2011 IEEE International Parallel &amp; Distributed Processing Symposium</btitle><stitle>ipdps</stitle><spage>275</spage><epage>286</epage><pages>275-286</pages><issn>1530-2075</issn><isbn>1612843727</isbn><isbn>9781612843728</isbn><eisbn>9780769543857</eisbn><eisbn>0769543855</eisbn><abstract>Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has shown to be extremely efficient on distributed-memory architectures. 
However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.</abstract><pub>IEEE</pub><doi>10.1109/IPDPS.2011.35</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1530-2075
ispartof 2011 IEEE International Parallel & Distributed Processing Symposium, p.275-286
issn 1530-2075
language eng
recordid cdi_ieee_primary_6012844
source IEEE Xplore All Conference Series
subjects Deformable models
Interpolation
Laboratories
Multicore processing
Program processors
Three dimensional displays
title Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T18%3A13%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Challenges%20of%20Scaling%20Algebraic%20Multigrid%20Across%20Modern%20Multicore%20Architectures&rft.btitle=2011%20IEEE%20International%20Parallel%20&%20Distributed%20Processing%20Symposium&rft.au=Baker,%20A.%20H.&rft.spage=275&rft.epage=286&rft.pages=275-286&rft.issn=1530-2075&rft.isbn=1612843727&rft.isbn_list=9781612843728&rft_id=info:doi/10.1109/IPDPS.2011.35&rft.eisbn=9780769543857&rft.eisbn_list=0769543855&rft_dat=%3Cieee_CHZPO%3E6012844%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i1955-83bff91389fd84deaaa8f065abc0f4d17d3540ac5e47832da9937a3330b802f03%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6012844&rfr_iscdi=true