Gossip Learning with Linear Models on Fully Distributed Data

Bibliographic Details
Published in: arXiv.org, 2012-06
Main Authors: Ormándi, Róbert; Hegedüs, István; Jelasity, Márk
Format: Article
Language: English
EISSN: 2331-8422
DOI: 10.48550/arxiv.1109.1396
Publisher: Cornell University Library, arXiv.org (Ithaca)
Source: Publicly Available Content Database
Subjects: Algorithms; Communications systems; Distance learning; Machine learning; Random walk; Random walk theory; Teaching methods

Description:
Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications. In this model there is one data record at each network node, and raw data cannot be moved due to privacy considerations; user profiles, ratings, histories, or sensor readings can represent this case. The problem is difficult because a local model cannot be learned from a single record, the system model offers almost no reliability guarantees, and yet the communication cost must be kept low. Here we propose gossip learning, a generic approach based on multiple models that take random walks over the network in parallel, improve themselves by applying an online learning algorithm at the nodes they visit, and are combined via ensemble learning methods. We present an instantiation of this approach for classification with linear models. Our main contribution is an ensemble learning method which, through the continuous combination of the models in the network, implements a virtual weighted voting mechanism over an exponential number of models at practically no extra cost compared to independent random walks. We prove the convergence of the method theoretically, and perform extensive experiments on benchmark datasets. Our experimental analysis demonstrates the performance and robustness of the proposed approach.
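
The mechanism the abstract outlines lends itself to a short sketch. Below is a minimal single-process simulation of the gossip learning skeleton: each node stores one labeled record, every model hops to a uniformly random peer per round, each visit applies one online update (a Pegasos-style linear SVM step, one plausible instance of the "online learning algorithm"), and an arriving model is averaged with the model the receiver saw previously as a stand-in for the paper's merging rule. All identifiers, the update rule, and the plain averaging merge are illustrative assumptions rather than the authors' exact method.

```python
import random
import numpy as np

# Toy fully distributed dataset: one labeled record per node.
rng = np.random.default_rng(0)
N, DIM = 100, 5
X = rng.normal(size=(N, DIM))
true_w = rng.normal(size=DIM)
Y = np.sign(X @ true_w)               # labels in {-1, +1}

def pegasos_step(w, age, x, y, lam=0.01):
    """One online SGD step for a linear SVM (Pegasos); 'age' counts the
    updates this model has received so far, including this one."""
    eta = 1.0 / (lam * age)           # standard Pegasos learning rate
    w = (1.0 - eta * lam) * w         # regularization shrink
    if y * (w @ x) < 1.0:             # margin violated: move toward example
        w = w + eta * y * x
    return w

models = [(np.zeros(DIM), 0) for _ in range(N)]  # (weights, age) per node
last = [None] * N                     # model each node received previously

for _ in range(200):                  # synchronous gossip rounds
    for i in range(N):
        peer = random.randrange(N)    # random-walk step: model i visits peer
        w, age = models[i]
        if last[peer] is not None:    # combine with the previously received
            pw, page = last[peer]     # model; averaging stands in for the
            w = (w + pw) / 2.0        # paper's ensemble combination
            age = max(age, page)
        last[peer] = models[i]
        age += 1
        w = pegasos_step(w, age, X[peer], Y[peer])  # learn on local record
        models[peer] = (w, age)

acc = np.mean(np.sign(X @ models[0][0]) == Y)
print(f"training accuracy of one gossiped model: {acc:.2f}")
```

Repeated averaging is also why the abstract can speak of a virtual vote over an exponential number of models: after k merges, a transmitted weight vector is a convex combination of up to 2^k ancestor models, yet each message still carries only a single vector.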