Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments

Bibliographic Details
Published in: Neuron (Cambridge, Mass.), 2021-02, Vol. 109 (4), p. 724-738.e7
Main Authors: Cross, Logan, Cockburn, Jeff, Yue, Yisong, O’Doherty, John P.
Format: Article
Language:English
Description
Summary: Humans possess an exceptional aptitude to efficiently make decisions from high-dimensional sensory observations. However, it is unknown how the brain compactly represents the current state of the environment to guide this process. The deep Q-network (DQN) achieves this by capturing highly nonlinear mappings from multivariate inputs to the values of potential actions. We deployed DQN as a model of brain activity and behavior in participants playing three Atari video games during fMRI. Hidden layers of DQN exhibited a striking resemblance to voxel activity in a distributed sensorimotor network, extending throughout the dorsal visual pathway into posterior parietal cortex. Neural state-space representations emerged from nonlinear transformations of the pixel space bridging perception to action and reward. These transformations reshape axes to reflect relevant high-level features and strip away information about task-irrelevant sensory features. Our findings shed light on the neural encoding of task representations for decision-making in real-world situations.

Highlights:
• Naturalistic decision-making tasks modeled by a deep Q-network (DQN)
• Task representations encoded in dorsal visual pathway and posterior parietal cortex
• Computational principles common to both DQN and human brain are characterized

In brief: Cross et al. scanned humans playing Atari games and used a deep reinforcement learning algorithm as a model of how humans map high-dimensional sensory inputs onto actions. Representations in the intermediate layers of the algorithm were used to predict behavior and neural activity throughout a sensorimotor pathway.
ISSN:0896-6273
1097-4199
DOI:10.1016/j.neuron.2020.11.021
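
For readers unfamiliar with the model named in the summary, the following is a minimal sketch of the kind of deep Q-network architecture the abstract describes: convolutional hidden layers that transform raw game pixels into an abstract state representation, followed by fully connected layers that output a value for each candidate action. This is the standard Atari DQN layout (Mnih et al., 2015), not the authors' code; the layer sizes, method names, and input shape are illustrative assumptions.

```python
# Minimal DQN sketch (PyTorch). Illustrative only; not taken from the paper's code.
import torch
import torch.nn as nn

class DQN(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        # Convolutional layers map raw pixel observations to increasingly
        # abstract feature maps (the "hidden layers" the study compares to voxel activity).
        self.conv = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # Fully connected layers map the abstract state representation
        # to a value estimate for each available action.
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: a batch of 4 stacked 84x84 grayscale game frames.
        return self.fc(self.conv(frames))

    def hidden_activations(self, frames: torch.Tensor) -> torch.Tensor:
        # Intermediate-layer features of the kind that could be related to fMRI data
        # (hypothetical helper; the paper's actual analysis pipeline differs).
        return self.conv(frames).flatten(start_dim=1)

# Example usage: pick the greedy action for one placeholder observation.
net = DQN(n_actions=6)
obs = torch.rand(1, 4, 84, 84)
q_values = net(obs)                      # one value per candidate action
action = q_values.argmax(dim=1).item()
```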