Parallelization of module network structure learning and performance tuning on SMP
As an extension of the Bayesian network, the module network is an appropriate model for inferring a causal network over a large number of variables from insufficient evidence. Learning such a model, however, is still a time-consuming process. In this paper, we propose a parallel implementation of module network learning...
Main Authors: | Hongshan Jiang, Chunrong Lai, Wenguang Chen, Yurong Chen, Wei Hu, Weimin Zheng, Yimin Zhang |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Bayesian methods; Bioinformatics; Computer science; Multiprocessing systems; Partitioning algorithms; Scalability; Speech processing; Stochastic processes; Text mining; Yarn |
Online Access: | Request full text |
container_start_page | 8 pp.
---|---|
creator | Hongshan Jiang; Chunrong Lai; Wenguang Chen; Yurong Chen; Wei Hu; Weimin Zheng; Yimin Zhang
description | As an extension of the Bayesian network, the module network is an appropriate model for inferring a causal network over a large number of variables from insufficient evidence. Learning such a model, however, is still a time-consuming process. In this paper, we propose a parallel implementation of a module network learning algorithm using OpenMP. We propose a static task-partitioning strategy that distributes sub-search-spaces over worker threads to trade off load balance against software-cache contention. To overcome the performance penalties caused by shared-memory contention, we adopt several optimization techniques, such as memory pre-allocation, memory alignment, and static function usage. These optimizations influence sequential performance and parallel speedup in different ways, and experiments validate their effectiveness. For a 2,200-node dataset, they improve the parallel speedup by up to 88%, together with a 2X sequential performance improvement. With resource contention reduced, workload imbalance becomes the main hurdle to parallel scalability, and the program behaves more stably across platforms. (A hypothetical OpenMP sketch of these ideas follows the record table below.) |
doi_str_mv | 10.1109/IPDPS.2006.1639610 |
format | conference_proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1530-2075; ISBN: 1424400546; ISBN: 9781424400546 |
ispartof | Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006, 8 pp. |
issn | 1530-2075 |
language | eng |
recordid | cdi_ieee_primary_1639610 |
source | IEEE Xplore All Conference Series |
subjects | Bayesian methods; Bioinformatics; Computer science; Multiprocessing systems; Partitioning algorithms; Scalability; Speech processing; Stochastic processes; Text mining; Yarn |
title | Parallelization of module network structure learning and performance tuning on SMP |
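The record itself contains no source code. As a rough, hypothetical illustration of the two techniques the abstract names — statically partitioning sub-search-spaces across OpenMP worker threads, and pre-allocating cache-line-aligned per-thread scratch memory before the parallel region to reduce shared-memory and allocator contention — consider the following C sketch. It is not the authors' implementation; names such as `score_subspace`, `NUM_SUBSPACES`, and `SCRATCH_DOUBLES` are invented for the example.

```c
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_SUBSPACES   2200  /* hypothetical: one sub-search-space per node */
#define SCRATCH_DOUBLES 4096  /* per-thread scratch size; arbitrary for this sketch */

/* Stand-in for scoring one sub-search-space; the real algorithm would
   search candidate module/parent structures here. */
static double score_subspace(int subspace, double *scratch)
{
    scratch[0] = (double)subspace;  /* pretend to use the scratch buffer */
    return scratch[0] * 0.5;
}

int main(void)
{
    int nthreads = omp_get_max_threads();

    /* Pre-allocate one cache-line-aligned scratch buffer per thread up
       front, rather than calling malloc inside the parallel loop --
       allocator and shared-memory contention are what the abstract's
       optimizations target. */
    double **scratch = malloc((size_t)nthreads * sizeof *scratch);
    for (int t = 0; t < nthreads; t++)
        scratch[t] = aligned_alloc(64, SCRATCH_DOUBLES * sizeof(double));

    double best_score = -1.0;
    int    best_sub   = -1;

    /* schedule(static): each thread gets a fixed, contiguous chunk of
       sub-search-spaces -- predictable data placement (less software-cache
       contention) at the cost of possible load imbalance, the tradeoff the
       abstract describes. */
    #pragma omp parallel for schedule(static)
    for (int s = 0; s < NUM_SUBSPACES; s++) {
        double sc = score_subspace(s, scratch[omp_get_thread_num()]);
        #pragma omp critical
        {
            if (sc > best_score) { best_score = sc; best_sub = s; }
        }
    }

    printf("best sub-search-space: %d (score %.2f)\n", best_sub, best_score);

    for (int t = 0; t < nthreads; t++)
        free(scratch[t]);
    free(scratch);
    return 0;
}
```

Built with `gcc -fopenmp`, this is a minimal sketch only: the paper's actual decomposition of the search space, and its static-function optimization, are necessarily more involved than a `schedule(static)` clause and a placeholder scoring function.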