Parallelization of module network structure learning and performance tuning on SMP
As an extension of the Bayesian network, the module network is an appropriate model for inferring a causal network over a large number of variables from insufficient evidence. Learning such a model, however, is still a time-consuming process. In this paper, we propose a parallel implementation of module network learning...
Main Authors: | Hongshan Jiang, Chunrong Lai, Wenguang Chen, Yurong Chen, Wei Hu, Weimin Zheng, Yimin Zhang |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Bayesian methods; Bioinformatics; Computer science; Multiprocessing systems; Partitioning algorithms; Scalability; Speech processing; Stochastic processes; Text mining; Yarn |
Online Access: | Request full text |
container_start_page | 8 pp.
---|---|
creator | Hongshan Jiang; Chunrong Lai; Wenguang Chen; Yurong Chen; Wei Hu; Weimin Zheng; Yimin Zhang
description | As an extension of the Bayesian network, the module network is an appropriate model for inferring a causal network over a large number of variables from insufficient evidence. Learning such a model, however, is still a time-consuming process. In this paper, we propose a parallel implementation of a module network learning algorithm using OpenMP. We propose a static task-partitioning strategy that distributes sub-search-spaces over worker threads to trade off load balance against software-cache contention. To overcome the performance penalties caused by shared-memory contention, we adopt several optimization techniques, such as memory pre-allocation, memory alignment, and static function usage. These optimizations influence sequential performance and parallel speedup in different ways, and experiments validate their effectiveness. For a 2,200-node dataset, they improve the parallel speedup by up to 88%, together with a 2X sequential performance improvement. With resource contention reduced, workload imbalance becomes the main hurdle to parallel scalability, and the program behaves more stably across platforms. (A hypothetical OpenMP sketch of these ideas follows the record table below.) |
doi_str_mv | 10.1109/IPDPS.2006.1639610 |
format | conference_proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1530-2075; ISBN: 1424400546; ISBN: 9781424400546 |
ispartof | Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006, 8 pp. |
issn | 1530-2075 |
language | eng |
recordid | cdi_ieee_primary_1639610 |
source | IEEE Xplore All Conference Series |
subjects | Bayesian methods; Bioinformatics; Computer science; Multiprocessing systems; Partitioning algorithms; Scalability; Speech processing; Stochastic processes; Text mining; Yarn |
title | Parallelization of module network structure learning and performance tuning on SMP |
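The record itself contains no source code. As a rough, hypothetical illustration of the two techniques the abstract names — statically partitioning sub-search-spaces across OpenMP worker threads, and pre-allocating cache-line-aligned per-thread scratch memory before the parallel region to reduce shared-memory and allocator contention — consider the following C sketch. It is not the authors' implementation; names such as `score_subspace`, `NUM_SUBSPACES`, and `SCRATCH_DOUBLES` are invented for the example.

```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_SUBSPACES   2200  /* hypothetical: one sub-search-space per node */
#define SCRATCH_DOUBLES 4096  /* per-thread scratch size; arbitrary for this sketch */

/* Stand-in for scoring one sub-search-space; the real algorithm would
   search candidate module/parent structures here. */
static double score_subspace(int subspace, double *scratch)
{
    scratch[0] = (double)subspace;  /* pretend to use the scratch buffer */
    return scratch[0] * 0.5;
}

int main(void)
{
    int nthreads = omp_get_max_threads();

    /* Pre-allocate one cache-line-aligned scratch buffer per thread up
       front, rather than calling malloc inside the parallel loop --
       allocator and shared-memory contention are what the abstract's
       optimizations target. */
    double **scratch = malloc((size_t)nthreads * sizeof *scratch);
    for (int t = 0; t < nthreads; t++)
        scratch[t] = aligned_alloc(64, SCRATCH_DOUBLES * sizeof(double));

    double best_score = -1.0;
    int    best_sub   = -1;

    /* schedule(static): each thread gets a fixed, contiguous chunk of
       sub-search-spaces -- predictable data placement (less software-cache
       contention) at the cost of possible load imbalance, the tradeoff the
       abstract describes. */
    #pragma omp parallel for schedule(static)
    for (int s = 0; s < NUM_SUBSPACES; s++) {
        double sc = score_subspace(s, scratch[omp_get_thread_num()]);
        #pragma omp critical
        {
            if (sc > best_score) { best_score = sc; best_sub = s; }
        }
    }

    printf("best sub-search-space: %d (score %.2f)\n", best_sub, best_score);

    for (int t = 0; t < nthreads; t++)
        free(scratch[t]);
    free(scratch);
    return 0;
}
```

Built with `gcc -fopenmp`, this is a minimal sketch only: the paper's actual decomposition of the search space, and its static-function optimization, are necessarily more involved than a `schedule(static)` clause and a placeholder scoring function.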