Gossip Learning with Linear Models on Fully Distributed Data
Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications. In this model we have one data record at each network node, but without the possibility to move raw data due to privacy considerations. For example, user profiles, ratings, history, or sensor readings can represent this case. This problem is difficult, because there is no possibility to learn local models, the system model offers almost no guarantees for reliability, yet the communication cost needs to be kept low. Here we propose gossip learning, a generic approach that is based on multiple models taking random walks over the network in parallel, while applying an online learning algorithm to improve themselves, and getting combined via ensemble learning methods. We present an instantiation of this approach for the case of classification with linear models. Our main contribution is an ensemble learning method which, through the continuous combination of the models in the network, implements a virtual weighted voting mechanism over an exponential number of models at practically no extra cost as compared to independent random walks. We prove the convergence of the method theoretically, and perform extensive experiments on benchmark datasets. Our experimental analysis demonstrates the performance and robustness of the proposed approach.
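To make the abstract's scheme concrete, here is a minimal simulated sketch of gossip learning: each node holds exactly one record, models take random walk steps between nodes, the receiving node merges the incoming model with its own, and then performs one online update on its local record. This is an illustration, not the paper's implementation: averaging as the merge rule, logistic-regression SGD as the online learner, and all names (`sgd_update`, the synthetic dataset, the learning rate) are assumptions chosen for the sketch.

```python
import numpy as np

# Toy setup: one (x, y) record per node, labels in {-1, +1}.
rng = np.random.default_rng(0)
N, DIM = 100, 2
true_w = np.array([1.0, -1.0])
X = rng.normal(size=(N, DIM))
y = np.sign(X @ true_w)

# Each node stores its current linear model (weight vector).
models = [np.zeros(DIM) for _ in range(N)]

def sgd_update(w, x, label, lr=0.1):
    """One online logistic-regression step on a node's single record."""
    margin = label * (w @ x)
    grad = -label * x / (1.0 + np.exp(margin))
    return w - lr * grad

# Gossip rounds: every node forwards its model one random walk step;
# the receiver merges (averages) it with its resident model, then
# applies an online update on its own local record.
for _ in range(50):
    for sender in range(N):
        receiver = rng.integers(N)
        merged = (models[receiver] + models[sender]) / 2.0  # ensemble merge
        models[receiver] = sgd_update(merged, X[receiver], y[receiver])

# Every node now holds a usable model; evaluate one of them globally.
w = models[0]
acc = np.mean(np.sign(X @ w) == y)
print(f"node-0 model accuracy over all records: {acc:.2f}")
```

The averaging step is what distinguishes this from plain independent random walks: each model implicitly aggregates the update histories of the many walks it has merged with, which is roughly the "virtual weighted voting over an exponential number of models" the abstract refers to.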
Published in: | arXiv.org, 2012-06 |
---|---|
Main Authors: | Ormándi, Róbert; Hegedüs, István; Jelasity, Márk |
Format: | Article |
Language: | English |
Subjects: | Algorithms; Communications systems; Distance learning; Machine learning; Random walk; Random walk theory; Teaching methods |
DOI: | 10.48550/arxiv.1109.1396 |
EISSN: | 2331-8422 |
Source: | Publicly Available Content Database |