Modelling Sentence Pairs via Reinforcement Learning: An Actor-Critic Approach to Learn the Irrelevant Words

Learning sentence representations is a fundamental task in Natural Language Processing. Most existing sentence pair modelling architectures focus only on extracting and using rich sentence pair features. The drawback is that utilizing all of these features makes the learning process much harder...

Bibliographic Details
Main Authors: Ahmed, Mahtab, Mercer, Robert E.
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 7366
container_issue 5
container_start_page 7358
container_title Proceedings of the ... AAAI Conference on Artificial Intelligence
container_volume 34
creator Ahmed, Mahtab
Mercer, Robert E.
description Learning sentence representations is a fundamental task in Natural Language Processing. Most existing sentence pair modelling architectures focus only on extracting and using rich sentence pair features. The drawback is that utilizing all of these features makes the learning process much harder. In this study, we propose a reinforcement learning (RL) method to learn a sentence pair representation when performing tasks like semantic similarity, paraphrase identification, and question-answer pair modelling. We formulate this learning problem as a sequential decision-making task in which the decision made in the current state has a strong impact on the following decisions. We address this decision making with a policy gradient RL method that chooses the irrelevant words to delete by looking at the sub-optimal representation of the sentences being compared. With this policy, extensive experiments show that our model achieves on-par performance when learning task-specific representations of sentence pairs without needing any further knowledge such as parse trees. We suggest that the simplicity of each task inference provided by our RL model makes it easier to explain.
doi_str_mv 10.1609/aaai.v34i05.6230
format conference_proceeding
fulltext fulltext
identifier ISSN: 2159-5399
ispartof Proceedings of the ... AAAI Conference on Artificial Intelligence, 2020, Vol.34 (5), p.7358-7366
issn 2159-5399
2374-3468
language eng
recordid cdi_crossref_primary_10_1609_aaai_v34i05_6230
source Freely Accessible Journals
title Modelling Sentence Pairs via Reinforcement Learning: An Actor-Critic Approach to Learn the Irrelevant Words
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T05%3A43%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Modelling%20Sentence%20Pairs%20via%20Reinforcement%20Learning:%20An%20Actor-Critic%20Approach%20to%20Learn%20the%20Irrelevant%20Words&rft.btitle=Proceedings%20of%20the%20...%20AAAI%20Conference%20on%20Artificial%20Intelligence&rft.au=Ahmed,%20Mahtab&rft.date=2020-04-03&rft.volume=34&rft.issue=5&rft.spage=7358&rft.epage=7366&rft.pages=7358-7366&rft.issn=2159-5399&rft.eissn=2374-3468&rft_id=info:doi/10.1609/aaai.v34i05.6230&rft_dat=%3Ccrossref%3E10_1609_aaai_v34i05_6230%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c126t-d53bdb76f1123b7b9687d2921885f5d47801ba73daae23b8f502b0ae12aadad43%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true