Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers
The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance...
Published in: | IEEE signal processing letters 2020, Vol.27, p.960-964 |
---|---|
Main Authors: | Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083 |
---|---|
cites | cdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083 |
container_end_page | 964 |
container_issue | |
container_start_page | 960 |
container_title | IEEE signal processing letters |
container_volume | 27 |
creator | Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita |
description | The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments. |
doi_str_mv | 10.1109/LSP.2020.2998361 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_LSP_2020_2998361</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9103219</ieee_id><sourcerecordid>2416008242</sourcerecordid><originalsourceid>FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</originalsourceid><addsrcrecordid>eNo9kE1LAzEYhBdRsFbvgpeA561vPnabHEv9KqxYqHpdstk3JaVu2mT30H9vasXTzGFmYJ4su6UwoRTUQ7VaThgwmDClJC_pWTaiRSFzlvx58jCFXCmQl9lVjBsAkFQWo2z9pYPTvfOd3pJH7LyLrluT2dB77IxvMUSiu5ZUqGOfr_aDDhjJ0m-dOZBFj-G3S6wPZNUnH3tnjktOb_16QPKmO71OI9fZhdXbiDd_Os4-n58-5q959f6ymM-q3HDO-1xKKywaO-WiBK2KhkPTWIuysTq95NS2ShQgpgLbEmSrFLbCGCOwbAQDycfZ_Wl3F_x-wNjXGz-EdC7WTNAy3WaCpRScUib4GAPaehfctw6HmkJ9xFknnPURZ_2HM1XuThWHiP9xRYEzqvgP5zhyfQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2416008242</pqid></control><display><type>article</type><title>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita</creator><creatorcontrib>Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita</creatorcontrib><description>The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. 
To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.</description><identifier>ISSN: 1070-9908</identifier><identifier>EISSN: 1558-2361</identifier><identifier>DOI: 10.1109/LSP.2020.2998361</identifier><identifier>CODEN: ISPLEM</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Approximation algorithms ; Degradation ; denoising ; dialogue systems ; Encoding ; Least squares ; least-squares policy iteration ; Management systems ; Noise ; Noise reduction ; Optimization ; Performance degradation ; sample-efficient statistical dialogue managers ; Signal processing algorithms ; Training ; Variational autoencoders</subject><ispartof>IEEE signal processing letters, 2020, Vol.27, p.960-964</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</citedby><cites>FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</cites><orcidid>0000-0002-1030-2892 ; 0000-0003-4212-7037 ; 0000-0001-8044-3511</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9103219$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,4024,27923,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Diakoloukas, Vassilios</creatorcontrib><creatorcontrib>Lygerakis, Fotios</creatorcontrib><creatorcontrib>Lagoudakis, Michail G.</creatorcontrib><creatorcontrib>Kotti, Margarita</creatorcontrib><title>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</title><title>IEEE signal processing letters</title><addtitle>LSP</addtitle><description>The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). 
The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.</description><subject>Algorithms</subject><subject>Approximation algorithms</subject><subject>Degradation</subject><subject>denoising</subject><subject>dialogue systems</subject><subject>Encoding</subject><subject>Least squares</subject><subject>least-squares policy iteration</subject><subject>Management systems</subject><subject>Noise</subject><subject>Noise reduction</subject><subject>Optimization</subject><subject>Performance degradation</subject><subject>sample-efficient statistical dialogue managers</subject><subject>Signal processing algorithms</subject><subject>Training</subject><subject>Variational autoencoders</subject><issn>1070-9908</issn><issn>1558-2361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNo9kE1LAzEYhBdRsFbvgpeA561vPnabHEv9KqxYqHpdstk3JaVu2mT30H9vasXTzGFmYJ4su6UwoRTUQ7VaThgwmDClJC_pWTaiRSFzlvx58jCFXCmQl9lVjBsAkFQWo2z9pYPTvfOd3pJH7LyLrluT2dB77IxvMUSiu5ZUqGOfr_aDDhjJ0m-dOZBFj-G3S6wPZNUnH3tnjktOb_16QPKmO71OI9fZhdXbiDd_Os4-n58-5q959f6ymM-q3HDO-1xKKywaO-WiBK2KhkPTWIuysTq95NS2ShQgpgLbEmSrFLbCGCOwbAQDycfZ_Wl3F_x-wNjXGz-EdC7WTNAy3WaCpRScUib4GAPaehfctw6HmkJ9xFknnPURZ_2HM1XuThWHiP9xRYEzqvgP5zhyfQ</recordid><startdate>2020</startdate><enddate>2020</enddate><creator>Diakoloukas, Vassilios</creator><creator>Lygerakis, Fotios</creator><creator>Lagoudakis, Michail G.</creator><creator>Kotti, Margarita</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-1030-2892</orcidid><orcidid>https://orcid.org/0000-0003-4212-7037</orcidid><orcidid>https://orcid.org/0000-0001-8044-3511</orcidid></search><sort><creationdate>2020</creationdate><title>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</title><author>Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Approximation algorithms</topic><topic>Degradation</topic><topic>denoising</topic><topic>dialogue systems</topic><topic>Encoding</topic><topic>Least squares</topic><topic>least-squares policy iteration</topic><topic>Management systems</topic><topic>Noise</topic><topic>Noise reduction</topic><topic>Optimization</topic><topic>Performance degradation</topic><topic>sample-efficient statistical dialogue managers</topic><topic>Signal processing algorithms</topic><topic>Training</topic><topic>Variational autoencoders</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Diakoloukas, Vassilios</creatorcontrib><creatorcontrib>Lygerakis, Fotios</creatorcontrib><creatorcontrib>Lagoudakis, Michail G.</creatorcontrib><creatorcontrib>Kotti, Margarita</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE/IET Electronic Library 
(IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE signal processing letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Diakoloukas, Vassilios</au><au>Lygerakis, Fotios</au><au>Lagoudakis, Michail G.</au><au>Kotti, Margarita</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</atitle><jtitle>IEEE signal processing letters</jtitle><stitle>LSP</stitle><date>2020</date><risdate>2020</risdate><volume>27</volume><spage>960</spage><epage>964</epage><pages>960-964</pages><issn>1070-9908</issn><eissn>1558-2361</eissn><coden>ISPLEM</coden><abstract>The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). 
The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/LSP.2020.2998361</doi><tpages>5</tpages><orcidid>https://orcid.org/0000-0002-1030-2892</orcidid><orcidid>https://orcid.org/0000-0003-4212-7037</orcidid><orcidid>https://orcid.org/0000-0001-8044-3511</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1070-9908 |
ispartof | IEEE signal processing letters, 2020, Vol.27, p.960-964 |
issn | 1070-9908 ; 1558-2361 |
language | eng |
recordid | cdi_crossref_primary_10_1109_LSP_2020_2998361 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Algorithms ; Approximation algorithms ; Degradation ; denoising ; dialogue systems ; Encoding ; Least squares ; least-squares policy iteration ; Management systems ; Noise ; Noise reduction ; Optimization ; Performance degradation ; sample-efficient statistical dialogue managers ; Signal processing algorithms ; Training ; Variational autoencoders |
title | Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T14%3A39%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Variational%20Denoising%20Autoencoders%20and%20Least-Squares%20Policy%20Iteration%20for%20Statistical%20Dialogue%20Managers&rft.jtitle=IEEE%20signal%20processing%20letters&rft.au=Diakoloukas,%20Vassilios&rft.date=2020&rft.volume=27&rft.spage=960&rft.epage=964&rft.pages=960-964&rft.issn=1070-9908&rft.eissn=1558-2361&rft.coden=ISPLEM&rft_id=info:doi/10.1109/LSP.2020.2998361&rft_dat=%3Cproquest_cross%3E2416008242%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2416008242&rft_id=info:pmid/&rft_ieee_id=9103219&rfr_iscdi=true |