Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures
Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has been shown to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore architectures, we face new challenges that c...
Main Authors: | Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. M. |
---|---|
Format: | Conference Proceeding |
Language: | English |
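Subjects: | Deformable models ; Interpolation ; Laboratories ; Multicore processing ; Program processors ; Three dimensional displays |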
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 286 |
container_issue | |
container_start_page | 275 |
container_title | |
container_volume | |
creator | Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. M. |
description | Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has been shown to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations. |
doi_str_mv | 10.1109/IPDPS.2011.35 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_6012844</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6012844</ieee_id><sourcerecordid>6012844</sourcerecordid><originalsourceid>FETCH-LOGICAL-i1955-83bff91389fd84deaaa8f065abc0f4d17d3540ac5e47832da9937a3330b802f03</originalsourceid><addsrcrecordid>eNotj81KAzEYACMqWGuPnrzkBXb9ki_ZJMelWi20tFA9l2x-toG1K0l78O1F6mlgDgNDyCODmjEwz8vty3ZXc2CsRnlFZkZpUI2RArVU1-SeNYxrgYqrGzJhEqHioOQdmZWSOuCNapTQckI284MdhnDsQ6FjpDtnh3TsaTv0ocs2Obo-D6fU5-Rp6_JYCl2PPuTjxbsxB9pmd0in4E7nHMoDuY12KGH2zyn5XLx-zN-r1eZtOW9XVWJGykpjF6NhqE30WvhgrdURGmk7B1F4pjxKAdbJIJRG7q0xqCwiQqeBR8Apebp0Uwhh_53Tl80_-wb-rgX-AqQEUT8</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</title><source>IEEE Xplore All Conference Series</source><creator>Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. M.</creator><creatorcontrib>Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. M.</creatorcontrib><description>Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has shown to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. 
We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.</description><identifier>ISSN: 1530-2075</identifier><identifier>ISBN: 1612843727</identifier><identifier>ISBN: 9781612843728</identifier><identifier>EISBN: 9780769543857</identifier><identifier>EISBN: 0769543855</identifier><identifier>DOI: 10.1109/IPDPS.2011.35</identifier><language>eng</language><publisher>IEEE</publisher><subject>Deformable models ; Interpolation ; Laboratories ; Multicore processing ; Program processors ; Three dimensional displays</subject><ispartof>2011 IEEE International Parallel & Distributed Processing Symposium, , p.275-286</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6012844$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2057,27924,54554,54919,54931</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6012844$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Baker, A. H.</creatorcontrib><creatorcontrib>Gamblin, T.</creatorcontrib><creatorcontrib>Schulz, M.</creatorcontrib><creatorcontrib>Yang, U. M.</creatorcontrib><title>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</title><title>2011 IEEE International Parallel & Distributed Processing Symposium</title><addtitle>ipdps</addtitle><description>Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has shown to be extremely efficient on distributed-memory architectures. 
However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.</description><subject>Deformable models</subject><subject>Interpolation</subject><subject>Laboratories</subject><subject>Multicore processing</subject><subject>Program processors</subject><subject>Three dimensional displays</subject><issn>1530-2075</issn><isbn>1612843727</isbn><isbn>9781612843728</isbn><isbn>9780769543857</isbn><isbn>0769543855</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotj81KAzEYACMqWGuPnrzkBXb9ki_ZJMelWi20tFA9l2x-toG1K0l78O1F6mlgDgNDyCODmjEwz8vty3ZXc2CsRnlFZkZpUI2RArVU1-SeNYxrgYqrGzJhEqHioOQdmZWSOuCNapTQckI284MdhnDsQ6FjpDtnh3TsaTv0ocs2Obo-D6fU5-Rp6_JYCl2PPuTjxbsxB9pmd0in4E7nHMoDuY12KGH2zyn5XLx-zN-r1eZtOW9XVWJGykpjF6NhqE30WvhgrdURGmk7B1F4pjxKAdbJIJRG7q0xqCwiQqeBR8Apebp0Uwhh_53Tl80_-wb-rgX-AqQEUT8</recordid><creator>Baker, A. H.</creator><creator>Gamblin, T.</creator><creator>Schulz, M.</creator><creator>Yang, U. M.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><title>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</title><author>Baker, A. H. ; Gamblin, T. ; Schulz, M. ; Yang, U. 
M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i1955-83bff91389fd84deaaa8f065abc0f4d17d3540ac5e47832da9937a3330b802f03</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><topic>Deformable models</topic><topic>Interpolation</topic><topic>Laboratories</topic><topic>Multicore processing</topic><topic>Program processors</topic><topic>Three dimensional displays</topic><toplevel>online_resources</toplevel><creatorcontrib>Baker, A. H.</creatorcontrib><creatorcontrib>Gamblin, T.</creatorcontrib><creatorcontrib>Schulz, M.</creatorcontrib><creatorcontrib>Yang, U. M.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Baker, A. H.</au><au>Gamblin, T.</au><au>Schulz, M.</au><au>Yang, U. M.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures</atitle><btitle>2011 IEEE International Parallel & Distributed Processing Symposium</btitle><stitle>ipdps</stitle><spage>275</spage><epage>286</epage><pages>275-286</pages><issn>1530-2075</issn><isbn>1612843727</isbn><isbn>9781612843728</isbn><eisbn>9780769543857</eisbn><eisbn>0769543855</eisbn><abstract>Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential component of many simulation codes. AMG has shown to be extremely efficient on distributed-memory architectures. 
However, when executed on modern multicore architectures, we face new challenges that can significantly deteriorate AMG's performance. We examine its performance and scalability on three disparate multicore architectures: a cluster with four AMD Opteron Quad-core processors per node (Hera), a Cray XT5 with two AMD Opteron Hex-core processors per node (Jaguar), and an IBM Blue Gene/P system with a single Quad-core processor (Intrepid). We discuss our experiences on these platforms and present results using both an MPI-only and a hybrid MPI/OpenMP model. We also discuss a set of techniques that helped to overcome the associated problems, including thread and process pinning and correct memory associations.</abstract><pub>IEEE</pub><doi>10.1109/IPDPS.2011.35</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1530-2075 |
ispartof | 2011 IEEE International Parallel & Distributed Processing Symposium, p.275-286 |
issn | 1530-2075 |
language | eng |
recordid | cdi_ieee_primary_6012844 |
source | IEEE Xplore All Conference Series |
subjects | Deformable models ; Interpolation ; Laboratories ; Multicore processing ; Program processors ; Three dimensional displays |
title | Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T18%3A13%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Challenges%20of%20Scaling%20Algebraic%20Multigrid%20Across%20Modern%20Multicore%20Architectures&rft.btitle=2011%20IEEE%20International%20Parallel%20&%20Distributed%20Processing%20Symposium&rft.au=Baker,%20A.%20H.&rft.spage=275&rft.epage=286&rft.pages=275-286&rft.issn=1530-2075&rft.isbn=1612843727&rft.isbn_list=9781612843728&rft_id=info:doi/10.1109/IPDPS.2011.35&rft.eisbn=9780769543857&rft.eisbn_list=0769543855&rft_dat=%3Cieee_CHZPO%3E6012844%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i1955-83bff91389fd84deaaa8f065abc0f4d17d3540ac5e47832da9937a3330b802f03%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6012844&rfr_iscdi=true |