A model of hippocampally dependent navigation, using the temporal difference learning rule
Published in: | Hippocampus 2000, Vol.10 (1), p.1-16 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | This paper presents a model of how hippocampal place cells might be used for spatial navigation in two watermaze tasks: the standard reference memory task and a delayed matching‐to‐place task. In the reference memory task, the escape platform occupies a single location and rats gradually learn relatively direct paths to the goal over the course of days, in each of which they perform a fixed number of trials. In the delayed matching‐to‐place task, the escape platform occupies a novel location on each day, and rats gradually acquire one‐trial learning, i.e., direct paths on the second trial of each day. The model uses a local, incremental, and statistically efficient connectionist algorithm called temporal difference learning in two distinct components. The first is a reinforcement‐based “actor‐critic” network that is a general model of classical and instrumental conditioning. In this case, it is applied to navigation, using place cells to provide information about state. By itself, the actor‐critic can learn the reference memory task, but this learning is inflexible to changes to the platform location. We argue that one‐trial learning in the delayed matching‐to‐place task demands a goal‐independent representation of space. This is provided by the second component of the model: a network that uses temporal difference learning and self‐motion information to acquire consistent spatial coordinates in the environment. Each component of the model is necessary at a different stage of the task; the actor‐critic provides a way of transferring control to the component that performs best. The model successfully captures gradual acquisition in both tasks, and, in particular, the ultimate development of one‐trial learning in the delayed matching‐to‐place task. Place cells report a form of stable, allocentric information that is well‐suited to the various kinds of learning in the model. Hippocampus 2000;10:1–16. © 2000 Wiley‐Liss, Inc. |
---|---|
ISSN: | 1050-9631 1098-1063 |
DOI: | 10.1002/(SICI)1098-1063(2000)10:1<1::AID-HIPO1>3.0.CO;2-1 |
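
The summary describes an actor-critic network trained with temporal difference learning, with place cells supplying the state. The following is a minimal sketch of that general idea, not the paper's implementation. The Gaussian place fields, the eight movement directions, the softmax action choice, and all parameter values (`sigma`, `gamma`, the learning rates) are assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

def place_cell_activity(pos, centers, sigma=0.16):
    """Firing rates of place cells at a 2-D position.

    Assumption: each cell has a Gaussian field of width `sigma`
    centred on one row of `centers`.
    """
    d2 = np.sum((centers - pos) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def td_error(reward, v_next, v_curr, gamma=0.98):
    """Temporal difference error: reward + gamma * V(next) - V(current)."""
    return reward + gamma * v_next - v_curr

def choose_action(z, f, rng):
    """Sample a movement direction from a softmax over actor preferences."""
    prefs = z @ f
    p = np.exp(prefs - prefs.max())  # subtract the max for numerical stability
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

def actor_critic_step(w, z, f_curr, f_next, action, reward, terminal,
                      gamma=0.98, lr_critic=0.05, lr_actor=0.05):
    """One TD update of the critic weights `w` and actor weights `z` (in place).

    The critic estimates value as a weighted sum of place cell activity.
    The same TD error trains both the critic and the chosen action's weights.
    """
    v_curr = w @ f_curr
    v_next = 0.0 if terminal else w @ f_next  # no future value once the platform is reached
    delta = td_error(reward, v_next, v_curr, gamma)
    w += lr_critic * delta * f_curr
    z[action] += lr_actor * delta * f_curr
    return delta
```

In a full simulation these functions would be called once per time step of a trial, with `reward` set to a positive value only when the simulated rat reaches the platform. Because the learned value and action preferences are tied to one platform location, a learner of this kind adapts slowly when the platform moves, which is the inflexibility the summary attributes to the actor-critic on its own.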