Reward hierarchical temporal memory
In humans and animals, the reward prediction error encoded by dopamine systems is thought to be important in the temporal-difference-learning class of reinforcement learning (RL). Many brain models have used RL algorithms to describe the function of dopamine and related areas, including the basal ganglia...
Main Authors: | Hansol Choi, Jun-Cheol Park, Jae Hyun Lim, Jae Young Jun, Dae-Shik Kim |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Animals; Brain modeling; Computational modeling; HTM; Instruments; Neurons; Prediction algorithms; Predictive models; reinforcement learning; reward; reward prediction error; reward-HTM; rHTM; temporal difference |
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 7 |
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Hansol Choi; Jun-Cheol Park; Jae Hyun Lim; Jae Young Jun; Dae-Shik Kim |
description | In humans and animals, the reward prediction error encoded by dopamine systems is thought to be important in the temporal-difference-learning class of reinforcement learning (RL). Many brain models have used RL algorithms to describe the function of dopamine and related areas, including the basal ganglia and frontal cortex. Despite this importance, how the reward prediction error itself is computed is not well understood, including how current states are assigned to memorized states and how the values of those states are memorized. In this paper, we describe a neocortical model for memorizing state space and computing reward prediction error, called `reward hierarchical temporal memory' (rHTM). In this model, the temporal relationships among events are stored hierarchically. Using this memory, rHTM computes reward prediction errors by associating memorized sequences with rewards, and it inhibits the predicted reward. In a simulation, our model behaved similarly to dopaminergic neurons. We suggest that our model can provide a hypothetical framework for the interaction between cortex and dopamine neurons. |
doi_str_mv | 10.1109/IJCNN.2012.6252433 |
format | conference_proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2161-4393; EISSN: 2161-4407; ISBN: 9781467314886; DOI: 10.1109/IJCNN.2012.6252433 |
ispartof | The 2012 International Joint Conference on Neural Networks (IJCNN), 2012, p.1-7 |
issn | 2161-4393 2161-4407 |
language | eng |
recordid | cdi_ieee_primary_6252433 |
source | IEEE Xplore All Conference Series |
subjects | Animals; Brain modeling; Computational modeling; HTM; Instruments; Neurons; Prediction algorithms; Predictive models; reinforcement learning; reward; reward prediction error; reward-HTM; rHTM; temporal difference |
title | Reward hierarchical temporal memory |
url | https://ieeexplore.ieee.org/document/6252433 |
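
The description above says that rHTM computes reward prediction errors by associating memorized sequences with rewards and inhibiting the predicted reward, and that in simulation the model behaved like dopaminergic neurons. For orientation, here is a minimal sketch of the temporal-difference (TD) prediction error that framing refers to, in a toy cue-delay-reward trial. This is a generic TD(0) illustration, not the paper's rHTM model: the state layout, learning rate, discount factor, and the zero-value pre-cue baseline are all illustrative assumptions.

```python
# Generic TD(0) sketch of a dopamine-like reward prediction error (delta).
# NOT the paper's rHTM model: the states, alpha, gamma, and the zero-value
# pre-cue baseline are illustrative assumptions.
import numpy as np

n_states = 10             # within-trial states: cue at t=0, reward at t=9
alpha, gamma = 0.1, 0.98  # learning rate and discount factor (assumed)
V = np.zeros(n_states)    # learned value of each within-trial state

def run_trial(V, rewarded=True):
    """Run one trial; return the TD error at cue onset and at each step."""
    # Cue onset: transition from an intertrial baseline whose value is
    # assumed to stay 0 (cue timing unpredictable), so delta = gamma * V[0].
    deltas = [gamma * V[0]]
    for t in range(n_states):
        r = 1.0 if (rewarded and t == n_states - 1) else 0.0
        v_next = V[t + 1] if t + 1 < n_states else 0.0
        delta = r + gamma * v_next - V[t]  # reward prediction error
        V[t] += alpha * delta              # TD(0) value update
        deltas.append(delta)
    return np.array(deltas)

early = run_trial(V)                    # naive: delta peaks at reward delivery
for _ in range(2000):                   # train until the reward is predicted
    run_trial(V)
late = run_trial(V)                     # trained: delta has moved to the cue
omitted = run_trial(V, rewarded=False)  # omission: dip at the reward time

print("naive  :", np.round(early, 2))
print("trained:", np.round(late, 2))
print("omitted:", np.round(omitted, 2))
```

Before learning, delta spikes at reward delivery; after repeated pairings it transfers to cue onset; and omitting the reward produces a negative dip at the expected time. This is the qualitative dopamine pattern that the paper's simulation is compared against.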