Loading…

Reinforcement learning for combinatorial optimization: A survey

Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the hard nature of the problems. Reinforcement learning (RL) pro...

Full description

Saved in:
Bibliographic Details
Published in:Computers & operations research 2021-10, Vol.134, p.105400, Article 105400
Main Authors: Mazyavkina, Nina, Sviridov, Sergey, Ivanov, Sergei, Burnaev, Evgeny
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by cdi_FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3
cites cdi_FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3
container_end_page
container_issue
container_start_page 105400
container_title Computers & operations research
container_volume 134
creator Mazyavkina, Nina
Sviridov, Sergey
Ivanov, Sergei
Burnaev, Evgeny
description Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the hard nature of the problems. Reinforcement learning (RL) proposes a good alternative to automate the search of these heuristics by training an agent in a supervised or self-supervised manner. In this survey, we explore the recent advancements of applying RL frameworks to hard combinatorial problems. Our survey provides the necessary background for operations research and machine learning communities and showcases the works that are moving the field forward. We juxtapose recently proposed RL methods, laying out the timeline of the improvements for each problem, as well as we make a comparison with traditional algorithms, indicating that RL models can become a promising direction for solving combinatorial problems. •Survey explores synergy of combinatorial optimization and reinforcement learning.•It covers recent papers showing how RL can be applied to solve canonical CO problems.•Graph neural networks are used as a method for graph representation in RL for CO.•Policy-based and value-based RL methods are used to learn the solutions to CO tasks.•Most of the reported experiments do not overlap.•This makes it hard to fairly compare the surveyed methods.
doi_str_mv 10.1016/j.cor.2021.105400
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2556028683</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0305054821001660</els_id><sourcerecordid>2556028683</sourcerecordid><originalsourceid>FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKs_wNuC563J7ibZ6kFKqR9QEETBW8gmE8nSTWqSFuqvN2U9O5dhZt53ZngQuiZ4RjBht_1M-TCrcEVyTRuMT9CEtLwuOaOfp2iCa0zLPGjP0UWMPc7BKzJBD29gnfFBwQAuFRuQwVn3VeRWofzQWSeTD1ZuCr9NdrA_Mlnv7opFEXdhD4dLdGbkJsLVX56ij8fV-_K5XL8-vSwX61LVlKeSNS2dU9ANxRUwRbTpmKIVN53hNVXGdHMy54rWptWaSd6QTstGtQw6TTnR9RTdjHu3wX_vICbR-11w-aSoKGW4allbZxUZVSr4GAMYsQ12kOEgCBZHTqIXmZM4chIjp-y5Hz2Q399bCCIqC06BtgFUEtrbf9y_pbFwzg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2556028683</pqid></control><display><type>article</type><title>Reinforcement learning for combinatorial optimization: A survey</title><source>ScienceDirect Freedom Collection 2022-2024</source><creator>Mazyavkina, Nina ; Sviridov, Sergey ; Ivanov, Sergei ; Burnaev, Evgeny</creator><creatorcontrib>Mazyavkina, Nina ; Sviridov, Sergey ; Ivanov, Sergei ; Burnaev, Evgeny</creatorcontrib><description>Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the hard nature of the problems. Reinforcement learning (RL) proposes a good alternative to automate the search of these heuristics by training an agent in a supervised or self-supervised manner. In this survey, we explore the recent advancements of applying RL frameworks to hard combinatorial problems. Our survey provides the necessary background for operations research and machine learning communities and showcases the works that are moving the field forward. We juxtapose recently proposed RL methods, laying out the timeline of the improvements for each problem, as well as we make a comparison with traditional algorithms, indicating that RL models can become a promising direction for solving combinatorial problems. •Survey explores synergy of combinatorial optimization and reinforcement learning.•It covers recent papers showing how RL can be applied to solve canonical CO problems.•Graph neural networks are used as a method for graph representation in RL for CO.•Policy-based and value-based RL methods are used to learn the solutions to CO tasks.•Most of the reported experiments do not overlap.•This makes it hard to fairly compare the surveyed methods.</description><identifier>ISSN: 0305-0548</identifier><identifier>EISSN: 1873-765X</identifier><identifier>EISSN: 0305-0548</identifier><identifier>DOI: 10.1016/j.cor.2021.105400</identifier><language>eng</language><publisher>New York: Elsevier Ltd</publisher><subject>Algorithms ; Combinatorial analysis ; Combinatorial optimization ; Heuristic ; Machine learning ; Operations research ; Optimization ; Policy-based methods ; Reinforcement learning ; Value-based methods</subject><ispartof>Computers &amp; operations research, 2021-10, Vol.134, p.105400, Article 105400</ispartof><rights>2021 Elsevier Ltd</rights><rights>Copyright Pergamon Press Inc. Oct 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3</citedby><cites>FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3</cites><orcidid>0000-0002-0874-4165 ; 0000-0001-8424-0690 ; 0000-0003-3281-4984</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Mazyavkina, Nina</creatorcontrib><creatorcontrib>Sviridov, Sergey</creatorcontrib><creatorcontrib>Ivanov, Sergei</creatorcontrib><creatorcontrib>Burnaev, Evgeny</creatorcontrib><title>Reinforcement learning for combinatorial optimization: A survey</title><title>Computers &amp; operations research</title><description>Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the hard nature of the problems. Reinforcement learning (RL) proposes a good alternative to automate the search of these heuristics by training an agent in a supervised or self-supervised manner. In this survey, we explore the recent advancements of applying RL frameworks to hard combinatorial problems. Our survey provides the necessary background for operations research and machine learning communities and showcases the works that are moving the field forward. We juxtapose recently proposed RL methods, laying out the timeline of the improvements for each problem, as well as we make a comparison with traditional algorithms, indicating that RL models can become a promising direction for solving combinatorial problems. •Survey explores synergy of combinatorial optimization and reinforcement learning.•It covers recent papers showing how RL can be applied to solve canonical CO problems.•Graph neural networks are used as a method for graph representation in RL for CO.•Policy-based and value-based RL methods are used to learn the solutions to CO tasks.•Most of the reported experiments do not overlap.•This makes it hard to fairly compare the surveyed methods.</description><subject>Algorithms</subject><subject>Combinatorial analysis</subject><subject>Combinatorial optimization</subject><subject>Heuristic</subject><subject>Machine learning</subject><subject>Operations research</subject><subject>Optimization</subject><subject>Policy-based methods</subject><subject>Reinforcement learning</subject><subject>Value-based methods</subject><issn>0305-0548</issn><issn>1873-765X</issn><issn>0305-0548</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKs_wNuC563J7ibZ6kFKqR9QEETBW8gmE8nSTWqSFuqvN2U9O5dhZt53ZngQuiZ4RjBht_1M-TCrcEVyTRuMT9CEtLwuOaOfp2iCa0zLPGjP0UWMPc7BKzJBD29gnfFBwQAuFRuQwVn3VeRWofzQWSeTD1ZuCr9NdrA_Mlnv7opFEXdhD4dLdGbkJsLVX56ij8fV-_K5XL8-vSwX61LVlKeSNS2dU9ANxRUwRbTpmKIVN53hNVXGdHMy54rWptWaSd6QTstGtQw6TTnR9RTdjHu3wX_vICbR-11w-aSoKGW4allbZxUZVSr4GAMYsQ12kOEgCBZHTqIXmZM4chIjp-y5Hz2Q399bCCIqC06BtgFUEtrbf9y_pbFwzg</recordid><startdate>20211001</startdate><enddate>20211001</enddate><creator>Mazyavkina, Nina</creator><creator>Sviridov, Sergey</creator><creator>Ivanov, Sergei</creator><creator>Burnaev, Evgeny</creator><general>Elsevier Ltd</general><general>Pergamon Press Inc</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-0874-4165</orcidid><orcidid>https://orcid.org/0000-0001-8424-0690</orcidid><orcidid>https://orcid.org/0000-0003-3281-4984</orcidid></search><sort><creationdate>20211001</creationdate><title>Reinforcement learning for combinatorial optimization: A survey</title><author>Mazyavkina, Nina ; Sviridov, Sergey ; Ivanov, Sergei ; Burnaev, Evgeny</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Combinatorial analysis</topic><topic>Combinatorial optimization</topic><topic>Heuristic</topic><topic>Machine learning</topic><topic>Operations research</topic><topic>Optimization</topic><topic>Policy-based methods</topic><topic>Reinforcement learning</topic><topic>Value-based methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mazyavkina, Nina</creatorcontrib><creatorcontrib>Sviridov, Sergey</creatorcontrib><creatorcontrib>Ivanov, Sergei</creatorcontrib><creatorcontrib>Burnaev, Evgeny</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computers &amp; operations research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mazyavkina, Nina</au><au>Sviridov, Sergey</au><au>Ivanov, Sergei</au><au>Burnaev, Evgeny</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Reinforcement learning for combinatorial optimization: A survey</atitle><jtitle>Computers &amp; operations research</jtitle><date>2021-10-01</date><risdate>2021</risdate><volume>134</volume><spage>105400</spage><pages>105400-</pages><artnum>105400</artnum><issn>0305-0548</issn><eissn>1873-765X</eissn><eissn>0305-0548</eissn><abstract>Many traditional algorithms for solving combinatorial optimization problems involve using hand-crafted heuristics that sequentially construct a solution. Such heuristics are designed by domain experts and may often be suboptimal due to the hard nature of the problems. Reinforcement learning (RL) proposes a good alternative to automate the search of these heuristics by training an agent in a supervised or self-supervised manner. In this survey, we explore the recent advancements of applying RL frameworks to hard combinatorial problems. Our survey provides the necessary background for operations research and machine learning communities and showcases the works that are moving the field forward. We juxtapose recently proposed RL methods, laying out the timeline of the improvements for each problem, as well as we make a comparison with traditional algorithms, indicating that RL models can become a promising direction for solving combinatorial problems. •Survey explores synergy of combinatorial optimization and reinforcement learning.•It covers recent papers showing how RL can be applied to solve canonical CO problems.•Graph neural networks are used as a method for graph representation in RL for CO.•Policy-based and value-based RL methods are used to learn the solutions to CO tasks.•Most of the reported experiments do not overlap.•This makes it hard to fairly compare the surveyed methods.</abstract><cop>New York</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.cor.2021.105400</doi><orcidid>https://orcid.org/0000-0002-0874-4165</orcidid><orcidid>https://orcid.org/0000-0001-8424-0690</orcidid><orcidid>https://orcid.org/0000-0003-3281-4984</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0305-0548
ispartof Computers & operations research, 2021-10, Vol.134, p.105400, Article 105400
issn 0305-0548
1873-765X
0305-0548
language eng
recordid cdi_proquest_journals_2556028683
source ScienceDirect Freedom Collection 2022-2024
subjects Algorithms
Combinatorial analysis
Combinatorial optimization
Heuristic
Machine learning
Operations research
Optimization
Policy-based methods
Reinforcement learning
Value-based methods
title Reinforcement learning for combinatorial optimization: A survey
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T15%3A26%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reinforcement%20learning%20for%20combinatorial%20optimization:%20A%20survey&rft.jtitle=Computers%20&%20operations%20research&rft.au=Mazyavkina,%20Nina&rft.date=2021-10-01&rft.volume=134&rft.spage=105400&rft.pages=105400-&rft.artnum=105400&rft.issn=0305-0548&rft.eissn=1873-765X&rft_id=info:doi/10.1016/j.cor.2021.105400&rft_dat=%3Cproquest_cross%3E2556028683%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c357t-648595ed4502e6c1dfb6c527fbf735cffb9197c53f8dd6a741bda4c86ebd571d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2556028683&rft_id=info:pmid/&rfr_iscdi=true