
Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers

The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance...

Bibliographic Details
Published in: IEEE signal processing letters 2020, Vol.27, p.960-964
Main Authors: Diakoloukas, Vassilios; Lygerakis, Fotios; Lagoudakis, Michail G.; Kotti, Margarita
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083
cites cdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083
container_end_page 964
container_issue
container_start_page 960
container_title IEEE signal processing letters
container_volume 27
creator Diakoloukas, Vassilios
Lygerakis, Fotios
Lagoudakis, Michail G.
Kotti, Margarita
description The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.
doi_str_mv 10.1109/LSP.2020.2998361
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_LSP_2020_2998361</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9103219</ieee_id><sourcerecordid>2416008242</sourcerecordid><originalsourceid>FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</originalsourceid><addsrcrecordid>eNo9kE1LAzEYhBdRsFbvgpeA561vPnabHEv9KqxYqHpdstk3JaVu2mT30H9vasXTzGFmYJ4su6UwoRTUQ7VaThgwmDClJC_pWTaiRSFzlvx58jCFXCmQl9lVjBsAkFQWo2z9pYPTvfOd3pJH7LyLrluT2dB77IxvMUSiu5ZUqGOfr_aDDhjJ0m-dOZBFj-G3S6wPZNUnH3tnjktOb_16QPKmO71OI9fZhdXbiDd_Os4-n58-5q959f6ymM-q3HDO-1xKKywaO-WiBK2KhkPTWIuysTq95NS2ShQgpgLbEmSrFLbCGCOwbAQDycfZ_Wl3F_x-wNjXGz-EdC7WTNAy3WaCpRScUib4GAPaehfctw6HmkJ9xFknnPURZ_2HM1XuThWHiP9xRYEzqvgP5zhyfQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2416008242</pqid></control><display><type>article</type><title>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita</creator><creatorcontrib>Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita</creatorcontrib><description>The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. 
To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.</description><identifier>ISSN: 1070-9908</identifier><identifier>EISSN: 1558-2361</identifier><identifier>DOI: 10.1109/LSP.2020.2998361</identifier><identifier>CODEN: ISPLEM</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Algorithms ; Approximation algorithms ; Degradation ; denoising ; dialogue systems ; Encoding ; Least squares ; least-squares policy iteration ; Management systems ; Noise ; Noise reduction ; Optimization ; Performance degradation ; sample-efficient statistical dialogue managers ; Signal processing algorithms ; Training ; Variational autoencoders</subject><ispartof>IEEE signal processing letters, 2020, Vol.27, p.960-964</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</citedby><cites>FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</cites><orcidid>0000-0002-1030-2892 ; 0000-0003-4212-7037 ; 0000-0001-8044-3511</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9103219$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,4024,27923,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Diakoloukas, Vassilios</creatorcontrib><creatorcontrib>Lygerakis, Fotios</creatorcontrib><creatorcontrib>Lagoudakis, Michail G.</creatorcontrib><creatorcontrib>Kotti, Margarita</creatorcontrib><title>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</title><title>IEEE signal processing letters</title><addtitle>LSP</addtitle><description>The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). 
The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.</description><subject>Algorithms</subject><subject>Approximation algorithms</subject><subject>Degradation</subject><subject>denoising</subject><subject>dialogue systems</subject><subject>Encoding</subject><subject>Least squares</subject><subject>least-squares policy iteration</subject><subject>Management systems</subject><subject>Noise</subject><subject>Noise reduction</subject><subject>Optimization</subject><subject>Performance degradation</subject><subject>sample-efficient statistical dialogue managers</subject><subject>Signal processing algorithms</subject><subject>Training</subject><subject>Variational autoencoders</subject><issn>1070-9908</issn><issn>1558-2361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNo9kE1LAzEYhBdRsFbvgpeA561vPnabHEv9KqxYqHpdstk3JaVu2mT30H9vasXTzGFmYJ4su6UwoRTUQ7VaThgwmDClJC_pWTaiRSFzlvx58jCFXCmQl9lVjBsAkFQWo2z9pYPTvfOd3pJH7LyLrluT2dB77IxvMUSiu5ZUqGOfr_aDDhjJ0m-dOZBFj-G3S6wPZNUnH3tnjktOb_16QPKmO71OI9fZhdXbiDd_Os4-n58-5q959f6ymM-q3HDO-1xKKywaO-WiBK2KhkPTWIuysTq95NS2ShQgpgLbEmSrFLbCGCOwbAQDycfZ_Wl3F_x-wNjXGz-EdC7WTNAy3WaCpRScUib4GAPaehfctw6HmkJ9xFknnPURZ_2HM1XuThWHiP9xRYEzqvgP5zhyfQ</recordid><startdate>2020</startdate><enddate>2020</enddate><creator>Diakoloukas, Vassilios</creator><creator>Lygerakis, Fotios</creator><creator>Lagoudakis, Michail G.</creator><creator>Kotti, Margarita</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-1030-2892</orcidid><orcidid>https://orcid.org/0000-0003-4212-7037</orcidid><orcidid>https://orcid.org/0000-0001-8044-3511</orcidid></search><sort><creationdate>2020</creationdate><title>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</title><author>Diakoloukas, Vassilios ; Lygerakis, Fotios ; Lagoudakis, Michail G. ; Kotti, Margarita</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Approximation algorithms</topic><topic>Degradation</topic><topic>denoising</topic><topic>dialogue systems</topic><topic>Encoding</topic><topic>Least squares</topic><topic>least-squares policy iteration</topic><topic>Management systems</topic><topic>Noise</topic><topic>Noise reduction</topic><topic>Optimization</topic><topic>Performance degradation</topic><topic>sample-efficient statistical dialogue managers</topic><topic>Signal processing algorithms</topic><topic>Training</topic><topic>Variational autoencoders</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Diakoloukas, Vassilios</creatorcontrib><creatorcontrib>Lygerakis, Fotios</creatorcontrib><creatorcontrib>Lagoudakis, Michail G.</creatorcontrib><creatorcontrib>Kotti, Margarita</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE/IET Electronic Library 
(IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE signal processing letters</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Diakoloukas, Vassilios</au><au>Lygerakis, Fotios</au><au>Lagoudakis, Michail G.</au><au>Kotti, Margarita</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers</atitle><jtitle>IEEE signal processing letters</jtitle><stitle>LSP</stitle><date>2020</date><risdate>2020</risdate><volume>27</volume><spage>960</spage><epage>964</epage><pages>960-964</pages><issn>1070-9908</issn><eissn>1558-2361</eissn><coden>ISPLEM</coden><abstract>The use of Reinforcement Learning (RL) approaches for dialogue policy optimization has been the new trend for dialogue management systems. Several methods have been proposed, which are trained on dialogue data to provide optimal system response. However, most of these approaches exhibit performance degradation in the presence of noise, poor scalability to other domains, as well as performance instabilities. To overcome these problems, we propose a novel approach based on the incremental, sample-efficient Least-Squares Policy Iteration (LSPI) algorithm, which is trained on compact, fixed-size dialogue state encodings, obtained from deep Variational Denoising Autoencoders (VDAE). 
The proposed scheme exhibits stable and noise-robust performance, which significantly outperforms the current state-of-the-art, even in mismatched noise environments.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/LSP.2020.2998361</doi><tpages>5</tpages><orcidid>https://orcid.org/0000-0002-1030-2892</orcidid><orcidid>https://orcid.org/0000-0003-4212-7037</orcidid><orcidid>https://orcid.org/0000-0001-8044-3511</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1070-9908
ispartof IEEE signal processing letters, 2020, Vol.27, p.960-964
issn 1070-9908
1558-2361
language eng
recordid cdi_crossref_primary_10_1109_LSP_2020_2998361
source IEEE Electronic Library (IEL) Journals
subjects Algorithms
Approximation algorithms
Degradation
denoising
dialogue systems
Encoding
Least squares
least-squares policy iteration
Management systems
Noise
Noise reduction
Optimization
Performance degradation
sample-efficient statistical dialogue managers
Signal processing algorithms
Training
Variational autoencoders
title Variational Denoising Autoencoders and Least-Squares Policy Iteration for Statistical Dialogue Managers
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T14%3A39%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Variational%20Denoising%20Autoencoders%20and%20Least-Squares%20Policy%20Iteration%20for%20Statistical%20Dialogue%20Managers&rft.jtitle=IEEE%20signal%20processing%20letters&rft.au=Diakoloukas,%20Vassilios&rft.date=2020&rft.volume=27&rft.spage=960&rft.epage=964&rft.pages=960-964&rft.issn=1070-9908&rft.eissn=1558-2361&rft.coden=ISPLEM&rft_id=info:doi/10.1109/LSP.2020.2998361&rft_dat=%3Cproquest_cross%3E2416008242%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c333t-88f4fecf73460a95b30bbffe8bfa10931fd9450474ed608d99ed4ccc4e6b42083%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2416008242&rft_id=info:pmid/&rft_ieee_id=9103219&rfr_iscdi=true