
Markov chain importance sampling for minibatches

Bibliographic Details
Published in: Machine learning, 2024-02, Vol. 113 (2), p. 789-814
Main Authors: Fuh, Cheng-Der; Wang, Chuan-Ju; Pai, Chen-Hung
Format: Article
Language: English
Description: This study investigates importance sampling under the scheme of minibatch stochastic gradient descent, under which the contributions are twofold. First, theoretically, we develop a neat tilting formula, which can be regarded as a general device for asymptotically optimal importance sampling. Second, practically, guided by the formula, we present an effective algorithm for importance sampling which accounts for the effects of minibatches and leverages the Markovian property of the gradients between iterations. Experiments conducted on artificial data confirm that our algorithm consistently delivers superior performance in terms of variance reduction. Furthermore, experiments carried out on real-world data demonstrate that our method, when paired with relatively straightforward models like multilayer perceptron and convolutional neural networks, outperforms in terms of training loss and testing error.
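
The abstract describes importance sampling for minibatch SGD: draw examples from a non-uniform (tilted) distribution and reweight their gradients so the minibatch gradient stays an unbiased estimate of the full-data gradient while its variance shrinks. The sketch below is a generic illustration of that pattern on a toy least-squares problem; it is not the paper's tilting formula or algorithm, and the sampling score, toy model, and function names are assumptions made purely for illustration.

```python
# Minimal, generic sketch of importance-sampled minibatch SGD on a toy
# least-squares problem.  NOT the method of Fuh, Wang and Pai (2024); it only
# shows the standard pattern the abstract builds on: sample indices from a
# non-uniform proposal q and reweight each per-example gradient by 1/(N*q_i)
# so the batch gradient remains unbiased while a good q lowers its variance.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N examples, d features, linear model with Gaussian noise.
N, d = 5000, 10
X = rng.normal(size=(N, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=N)


def per_example_grads(w, idx):
    """Per-example gradients of 0.5 * (x_i @ w - y_i)^2 for the chosen indices."""
    residuals = X[idx] @ w - y[idx]          # shape (batch,)
    return residuals[:, None] * X[idx]       # shape (batch, d)


def train(batch_size=32, steps=2000, lr=0.01, importance=True):
    w = np.zeros(d)
    for _ in range(steps):
        if importance:
            # Proposal proportional to a cheap bound on the gradient norm,
            # |x_i @ w - y_i| * ||x_i||.  Recomputing it exactly every step is
            # only for clarity; a practical scheme would keep stale or
            # approximate scores instead of a full pass over the data.
            scores = np.abs(X @ w - y) * np.linalg.norm(X, axis=1) + 1e-12
            q = scores / scores.sum()
        else:
            q = np.full(N, 1.0 / N)          # plain uniform minibatching
        idx = rng.choice(N, size=batch_size, p=q)
        # Weights 1/(N*q_i) make the weighted batch mean an unbiased
        # estimator of the full-data mean gradient.
        weights = 1.0 / (N * q[idx])
        grad = (weights[:, None] * per_example_grads(w, idx)).mean(axis=0)
        w -= lr * grad
    return w


if __name__ == "__main__":
    w_is = train(importance=True)
    w_uni = train(importance=False)
    print("parameter error, importance sampling:", np.linalg.norm(w_is - w_true))
    print("parameter error, uniform sampling:   ", np.linalg.norm(w_uni - w_true))
```

Running both variants with the same step budget and comparing parameter error (or the variance of the gradient estimate at a fixed point) gives a rough sense of the variance-reduction effect the abstract refers to. The paper's method additionally exploits the Markov dependence of gradients across iterations, which this toy sketch does not model.
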
DOI: 10.1007/s10994-023-06489-5
Publisher: Springer US (New York)
ISSN: 0885-6125
EISSN: 1573-0565
Source: Springer Nature
Subjects:
Algorithms
Artificial Intelligence
Artificial neural networks
Computer Science
Control
Importance sampling
Machine Learning
Markov analysis
Markov chains
Mechatronics
Multilayer perceptrons
Natural Language Processing (NLP)
Normal distribution
Robotics
Simulation and Modeling