
Reward hierarchical temporal memory

In humans and animals, the reward prediction error encoded by dopamine systems is thought to be central to the temporal difference learning class of reinforcement learning (RL). Using RL algorithms, many brain models have described the function of dopamine and related areas, including the basal ganglia...


Bibliographic Details
Main Authors: Hansol Choi, Jun-Cheol Park, Jae Hyun Lim, Jae Young Jun, Dae-Shik Kim
Format: Conference Proceeding
Language: English
Subjects: Animals; Brain modeling; Computational modeling; HTM; Instruments; Neurons; Prediction algorithms; Predictive models; reinforcement learning; reward; reward prediction error; reward-HTM; rHTM; temporal difference
Online Access: Request full text
container_end_page 7
container_start_page 1
description In humans and animals, the reward prediction error encoded by dopamine systems is thought to be central to the temporal difference learning class of reinforcement learning (RL). Using RL algorithms, many brain models have described the function of dopamine and related areas, including the basal ganglia and frontal cortex. Despite this importance, how the reward prediction error itself is computed is not well understood, including how current states are assigned to memorized states and how the values of those states are stored. In this paper, we describe a neocortical model for memorizing state space and computing reward prediction error, called 'reward hierarchical temporal memory' (rHTM). In this model, the temporal relationships among events are stored hierarchically. Using this memory, rHTM computes reward prediction errors by associating the memorized sequences with rewards and inhibiting the predicted reward. In a simulation, our model behaved similarly to dopaminergic neurons. We suggest that our model can provide a hypothetical framework for the interaction between cortex and dopamine neurons.
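For context on the computation the abstract refers to: in temporal difference (TD) learning, the reward prediction error is the mismatch delta = r + gamma * V(s') - V(s), the signal that dopaminergic firing is thought to resemble. The sketch below is a minimal, generic TD(0) illustration of that error on an assumed toy task; it is not a reconstruction of the authors' rHTM model, and all names and parameters (n_states, gamma, alpha) are illustrative assumptions.

import numpy as np

n_states = 5            # toy chain of states; reward at the final transition (assumed task)
gamma = 0.9             # discount factor (assumed)
alpha = 0.1             # learning rate (assumed)
V = np.zeros(n_states)  # learned state values; terminal state stays at 0

def td_error(s, r, s_next):
    # Reward prediction error: delta = r + gamma * V(s') - V(s)
    return r + gamma * V[s_next] - V[s]

for _ in range(200):
    for s in range(n_states - 1):
        r = 1.0 if s == n_states - 2 else 0.0  # reward only on entering the last state
        delta = td_error(s, r, s + 1)
        V[s] += alpha * delta                  # value update driven by the error

print(np.round(V, 3))  # values propagate backward from the rewarded state

During learning the error shrinks at the rewarded transition and transiently appears at earlier, predictive states -- the backward shift classically associated with dopaminergic responses in TD accounts.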
doi_str_mv 10.1109/IJCNN.2012.6252433
format conference_proceeding
identifier ISSN: 2161-4393; ISBN: 9781467314886; DOI: 10.1109/IJCNN.2012.6252433
ispartof The 2012 International Joint Conference on Neural Networks (IJCNN), 2012, p.1-7
issn 2161-4393
2161-4407
language eng
recordid cdi_ieee_primary_6252433
source IEEE Xplore All Conference Series
subjects Animals
Brain modeling
Computational modeling
HTM
Instruments
Neurons
Prediction algorithms
Predictive models
reinforcement learning
reward
reward prediction error
reward-HTM
rHTM
temporal difference
title Reward hierarchical temporal memory
url https://ieeexplore.ieee.org/document/6252433