Reward hierarchical temporal memory
In humans and animals, the reward prediction error encoded by dopamine systems is thought to be important in the temporal-difference-learning class of reinforcement learning (RL). Many brain models have used RL algorithms to describe the function of dopamine and related areas, including the basal ganglia...
Main Authors: | Hansol Choi, Jun-Cheol Park, Jae Hyun Lim, Jae Young Jun, Dae-Shik Kim |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Animals; Brain modeling; Computational modeling; HTM; Instruments; Neurons; Prediction algorithms; Predictive models; reinforcement learning; reward; reward prediction error; reward-HTM; rHTM; temporal difference |
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 7 |
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Hansol Choi; Jun-Cheol Park; Jae Hyun Lim; Jae Young Jun; Dae-Shik Kim |
description | In humans and animals, the reward prediction error encoded by dopamine systems is thought to be important in the temporal-difference-learning class of reinforcement learning (RL). Many brain models have used RL algorithms to describe the function of dopamine and related areas, including the basal ganglia and frontal cortex. Despite this importance, how the reward prediction error itself is computed is not well understood, including how current states are assigned to memorized states and how the values of those states are memorized. In this paper, we describe a neocortical model for memorizing state space and computing reward prediction error, called `reward hierarchical temporal memory' (rHTM). In this model, the temporal relationships among events are stored hierarchically. Using this memory, rHTM computes reward prediction errors by associating memorized sequences with rewards, and it inhibits the predicted reward. In a simulation, our model behaved similarly to dopaminergic neurons. We suggest that our model can provide a hypothetical framework for the interaction between cortex and dopamine neurons. |
doi_str_mv | 10.1109/IJCNN.2012.6252433 |
format | conference_proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2161-4393; EISSN: 2161-4407; ISBN: 9781467314886; DOI: 10.1109/IJCNN.2012.6252433 |
ispartof | The 2012 International Joint Conference on Neural Networks (IJCNN), 2012, p.1-7 |
issn | 2161-4393 2161-4407 |
language | eng |
recordid | cdi_ieee_primary_6252433 |
source | IEEE Xplore All Conference Series |
subjects | Animals; Brain modeling; Computational modeling; HTM; Instruments; Neurons; Prediction algorithms; Predictive models; reinforcement learning; reward; reward prediction error; reward-HTM; rHTM; temporal difference |
title | Reward hierarchical temporal memory |
url | https://ieeexplore.ieee.org/document/6252433 |
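
The description above says that rHTM computes reward prediction errors by associating memorized sequences with rewards and inhibiting the predicted reward, and that in simulation the model behaved like dopaminergic neurons. For orientation, here is a minimal sketch of the temporal-difference (TD) prediction error that framing refers to, in a toy cue-delay-reward trial. This is a generic TD(0) illustration, not the paper's rHTM model: the state layout, learning rate, discount factor, and the zero-value pre-cue baseline are all illustrative assumptions.

```python
# Generic TD(0) sketch of a dopamine-like reward prediction error (delta).
# NOT the paper's rHTM model: the states, alpha, gamma, and the zero-value
# pre-cue baseline are illustrative assumptions.
import numpy as np

n_states = 10             # within-trial states: cue at t=0, reward at t=9
alpha, gamma = 0.1, 0.98  # learning rate and discount factor (assumed)
V = np.zeros(n_states)    # learned value of each within-trial state

def run_trial(V, rewarded=True):
    """Run one trial; return the TD error at cue onset and at each step."""
    # Cue onset: transition from an intertrial baseline whose value is
    # assumed to stay 0 (cue timing unpredictable), so delta = gamma * V[0].
    deltas = [gamma * V[0]]
    for t in range(n_states):
        r = 1.0 if (rewarded and t == n_states - 1) else 0.0
        v_next = V[t + 1] if t + 1 < n_states else 0.0
        delta = r + gamma * v_next - V[t]  # reward prediction error
        V[t] += alpha * delta              # TD(0) value update
        deltas.append(delta)
    return np.array(deltas)

early = run_trial(V)                    # naive: delta peaks at reward delivery
for _ in range(2000):                   # train until the reward is predicted
    run_trial(V)
late = run_trial(V)                     # trained: delta has moved to the cue
omitted = run_trial(V, rewarded=False)  # omission: dip at the reward time

print("naive  :", np.round(early, 2))
print("trained:", np.round(late, 2))
print("omitted:", np.round(omitted, 2))
```

Before learning, delta spikes at reward delivery; after repeated pairings it transfers to cue onset; and omitting the reward produces a negative dip at the expected time. This is the qualitative dopamine pattern that the paper's simulation is compared against.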