Loading…

Distributed pagerank for P2P systems

This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables increm...

Full description

Saved in:
Bibliographic Details
Main Authors: Sankaralingam, K., Sethumadhavan, S., Browne, J.C.
Format: Conference Proceeding
Language:English
Subjects:
Citations: Items that cite this one
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by cdi_FETCH-LOGICAL-c254t-284268a8012eff6bfe48277421fdb2d3d2dc968e4f65655c0a56bda4e25824993
cites
container_end_page 68
container_issue
container_start_page 58
container_title
container_volume
creator Sankaralingam, K.
Sethumadhavan, S.
Browne, J.C.
description This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks whereas the currently centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computations as a replacement for the centralized Web crawler based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems although the time for iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by return of document hits. In basic P2P keyword search, all the document hits must be returned to the querying node causing large network traffic. An incremental keyword search algorithm for P2P keyword search where document hits are sorted by pagerank, and incrementally returned to the querying node is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in terms of effectiveness for users and decrease in network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.
doi_str_mv 10.1109/HPDC.2003.1210016
format conference_proceeding
fullrecord <record><control><sourceid>proquest_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_1210016</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1210016</ieee_id><sourcerecordid>31150428</sourcerecordid><originalsourceid>FETCH-LOGICAL-c254t-284268a8012eff6bfe48277421fdb2d3d2dc968e4f65655c0a56bda4e25824993</originalsourceid><addsrcrecordid>eNotj71OwzAURi0BEqX0ARBLBsSWcH39E3tELVCkSmSAOXLiaxRImhAnQ9-eSu1ZvuXokw5jdxwyzsE-bYvNOkMAkXHkAFxfsBvItVXcaoWXbMHBYGos5NdsFeMPHFHKiFwu2MOmidPYVPNEPhncN41u_5uEfkwKLJJ4iBN18ZZdBddGWp13yb5eXz7X23T38fa-ft6lNSo5pWgkauMMcKQQdBVIGsxziTz4Cr3w6GurDcmglVaqBqd05Z0kVAaltWLJHk-_w9j_zRSnsmtiTW3r9tTPsRScK5BojuL9SWyIqBzGpnPjoTzXi3_ZZ0ua</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype><pqid>31150428</pqid></control><display><type>conference_proceeding</type><title>Distributed pagerank for P2P systems</title><source>IEEE Xplore All Conference Series</source><creator>Sankaralingam, K. ; Sethumadhavan, S. ; Browne, J.C.</creator><creatorcontrib>Sankaralingam, K. ; Sethumadhavan, S. ; Browne, J.C.</creatorcontrib><description>This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks whereas the currently centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computations as a replacement for the centralized Web crawler based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems although the time for iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by return of document hits. In basic P2P keyword search, all the document hits must be returned to the querying node causing large network traffic. An incremental keyword search algorithm for P2P keyword search where document hits are sorted by pagerank, and incrementally returned to the querying node is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in terms of effectiveness for users and decrease in network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.</description><identifier>ISSN: 1082-8907</identifier><identifier>ISBN: 0769519652</identifier><identifier>ISBN: 9780769519654</identifier><identifier>DOI: 10.1109/HPDC.2003.1210016</identifier><language>eng</language><publisher>IEEE</publisher><subject>Chaos ; Computer networks ; Distributed algorithms ; Distributed computing ; Internet ; Iterative algorithms ; Keyword search ; Linear systems ; Peer to peer computing ; Telecommunication traffic</subject><ispartof>High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on, 2003, p.58-68</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c254t-284268a8012eff6bfe48277421fdb2d3d2dc968e4f65655c0a56bda4e25824993</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1210016$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1210016$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Sankaralingam, K.</creatorcontrib><creatorcontrib>Sethumadhavan, S.</creatorcontrib><creatorcontrib>Browne, J.C.</creatorcontrib><title>Distributed pagerank for P2P systems</title><title>High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on</title><addtitle>HPDC</addtitle><description>This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks whereas the currently centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computations as a replacement for the centralized Web crawler based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems although the time for iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by return of document hits. In basic P2P keyword search, all the document hits must be returned to the querying node causing large network traffic. An incremental keyword search algorithm for P2P keyword search where document hits are sorted by pagerank, and incrementally returned to the querying node is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in terms of effectiveness for users and decrease in network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.</description><subject>Chaos</subject><subject>Computer networks</subject><subject>Distributed algorithms</subject><subject>Distributed computing</subject><subject>Internet</subject><subject>Iterative algorithms</subject><subject>Keyword search</subject><subject>Linear systems</subject><subject>Peer to peer computing</subject><subject>Telecommunication traffic</subject><issn>1082-8907</issn><isbn>0769519652</isbn><isbn>9780769519654</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2003</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotj71OwzAURi0BEqX0ARBLBsSWcH39E3tELVCkSmSAOXLiaxRImhAnQ9-eSu1ZvuXokw5jdxwyzsE-bYvNOkMAkXHkAFxfsBvItVXcaoWXbMHBYGos5NdsFeMPHFHKiFwu2MOmidPYVPNEPhncN41u_5uEfkwKLJJ4iBN18ZZdBddGWp13yb5eXz7X23T38fa-ft6lNSo5pWgkauMMcKQQdBVIGsxziTz4Cr3w6GurDcmglVaqBqd05Z0kVAaltWLJHk-_w9j_zRSnsmtiTW3r9tTPsRScK5BojuL9SWyIqBzGpnPjoTzXi3_ZZ0ua</recordid><startdate>2003</startdate><enddate>2003</enddate><creator>Sankaralingam, K.</creator><creator>Sethumadhavan, S.</creator><creator>Browne, J.C.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>2003</creationdate><title>Distributed pagerank for P2P systems</title><author>Sankaralingam, K. ; Sethumadhavan, S. ; Browne, J.C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c254t-284268a8012eff6bfe48277421fdb2d3d2dc968e4f65655c0a56bda4e25824993</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Chaos</topic><topic>Computer networks</topic><topic>Distributed algorithms</topic><topic>Distributed computing</topic><topic>Internet</topic><topic>Iterative algorithms</topic><topic>Keyword search</topic><topic>Linear systems</topic><topic>Peer to peer computing</topic><topic>Telecommunication traffic</topic><toplevel>online_resources</toplevel><creatorcontrib>Sankaralingam, K.</creatorcontrib><creatorcontrib>Sethumadhavan, S.</creatorcontrib><creatorcontrib>Browne, J.C.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library Online</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sankaralingam, K.</au><au>Sethumadhavan, S.</au><au>Browne, J.C.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Distributed pagerank for P2P systems</atitle><btitle>High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on</btitle><stitle>HPDC</stitle><date>2003</date><risdate>2003</risdate><spage>58</spage><epage>68</epage><pages>58-68</pages><issn>1082-8907</issn><isbn>0769519652</isbn><isbn>9780769519654</isbn><abstract>This paper defines and describes a fully distributed implementation of Google's highly effective pagerank algorithm, for "peer to peer" (P2P) systems. The implementation is based on chaotic (asynchronous) iterative solution of linear systems. The P2P implementation also enables incremental computation of pageranks as new documents are entered into or deleted from the network. Incremental update enables continuously accurate pageranks whereas the currently centralized web crawl and computation over Internet documents requires several days. This suggests possible applicability of the distributed algorithm to pagerank computations as a replacement for the centralized Web crawler based implementation for Internet documents. A complete solution of the distributed pagerank computation for an in-place network converges rapidly (1% accuracy in 10 iterations) for large systems although the time for iteration may be long. The incremental computation resulting from addition of a single document converges extremely rapidly, typically requiring update path lengths of fewer than 15 nodes even for large networks and very accurate solutions. This implementation of pagerank provides a uniform ranking scheme for documents in P2P systems, and its integration with P2P keyword search provides one solution to the network traffic problems engendered by return of document hits. In basic P2P keyword search, all the document hits must be returned to the querying node causing large network traffic. An incremental keyword search algorithm for P2P keyword search where document hits are sorted by pagerank, and incrementally returned to the querying node is proposed and evaluated. Integration of this algorithm into P2P keyword search can produce dramatic benefit both in terms of effectiveness for users and decrease in network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word querying.</abstract><pub>IEEE</pub><doi>10.1109/HPDC.2003.1210016</doi><tpages>11</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1082-8907
ispartof High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on, 2003, p.58-68
issn 1082-8907
language eng
recordid cdi_ieee_primary_1210016
source IEEE Xplore All Conference Series
subjects Chaos
Computer networks
Distributed algorithms
Distributed computing
Internet
Iterative algorithms
Keyword search
Linear systems
Peer to peer computing
Telecommunication traffic
title Distributed pagerank for P2P systems
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T14%3A53%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Distributed%20pagerank%20for%20P2P%20systems&rft.btitle=High%20Performance%20Distributed%20Computing,%202003.%20Proceedings.%2012th%20IEEE%20International%20Symposium%20on&rft.au=Sankaralingam,%20K.&rft.date=2003&rft.spage=58&rft.epage=68&rft.pages=58-68&rft.issn=1082-8907&rft.isbn=0769519652&rft.isbn_list=9780769519654&rft_id=info:doi/10.1109/HPDC.2003.1210016&rft_dat=%3Cproquest_CHZPO%3E31150428%3C/proquest_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c254t-284268a8012eff6bfe48277421fdb2d3d2dc968e4f65655c0a56bda4e25824993%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=31150428&rft_id=info:pmid/&rft_ieee_id=1210016&rfr_iscdi=true