Grounding deep neural network predictions of human categorization behavior in understandable functional features: The case of face identity

Bibliographic Details
Published in:Patterns (New York, N.Y.), 2021-10, Vol.2 (10), p.100348, Article 100348
Main Authors: Daube, Christoph, Xu, Tian, Zhan, Jiayu, Webb, Andrew, Ince, Robin A.A., Garrod, Oliver G.B., Schyns, Philippe G.
Format: Article
Language:English
Summary:Deep neural networks (DNNs) can resolve real-world categorization tasks with apparent human-level performance. However, true equivalence of behavioral performance between humans and their DNN models requires that their internal mechanisms process equivalent features of the stimulus. To develop such feature equivalence, our methodology leveraged an interpretable and experimentally controlled generative model of the stimuli (realistic three-dimensional textured faces). Humans rated the similarity of randomly generated faces to four familiar identities. We predicted these similarity ratings from the activations of five DNNs trained with different optimization objectives. Using information-theoretic redundancy, reverse correlation, and the testing of generalization gradients, we show that DNN predictions of human behavior improve because their shape and texture features overlap with those that subsume human behavior. Thus, we must equate the functional features that subsume the behavioral performances of the brain and its models before comparing where, when, and how these features are processed.

Highlights:
•DNNs modeled how humans rate the similarity of familiar faces to random face stimuli
•A generative model controlled the shape and texture features of the face stimuli
•The best DNN predicted human behavior because it used similar face-shape features
•Explaining human behavior from causal features is difficult with naturalistic images

The bigger picture:Deep neural networks (DNNs) are often presented as “the best model” of human perception, achieving or even exceeding “human-level performance.” However, it remains difficult to describe what information these DNNs process from their inputs to produce their decisions. In naturalistic images, multiple cues can lead to the same decision. For example, a DNN can identify Peter's face from his darker eyebrows or high cheekbones, whereas a human who knows Peter could identify the same face with similar accuracy but using different features (e.g., his chin or hairstyle). Decision accuracy thus tells only the visible part of the story; the hidden part is the specific information processed to reach the decision. To address this, we compared DNNs that predicted human face-identity decisions to varying faces generated with a computer graphics program. With such controlled stimuli, we revealed the hidden part: the specific face information that caused the same behavioral decisions in humans and DNNs. With a set of five DNNs, Daube et al. model the behavior of n =
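The prediction step the summary describes (mapping DNN activations to human similarity ratings and scoring the fit out of sample) can be illustrated with a minimal sketch. The Python example below uses hypothetical stand-in arrays `activations` (stimuli × DNN units) and `ratings` (one human similarity rating per stimulus); the cross-validated ridge regression is a generic illustration, not the authors' exact fitting or validation pipeline.

```python
# Minimal sketch: predict human similarity ratings from DNN layer activations.
# All data here are random stand-ins; the study used five DNNs, four familiar
# identities, and its own fitting and generalization-testing procedures.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_stimuli, n_units = 500, 256                          # hypothetical sizes
activations = rng.normal(size=(n_stimuli, n_units))    # stand-in DNN layer output
ratings = rng.uniform(1, 6, size=n_stimuli)            # stand-in similarity ratings

# Ridge regression with internal selection of the regularization strength,
# scored by out-of-sample R^2 across cross-validation folds.
model = RidgeCV(alphas=np.logspace(-3, 3, 13))
scores = cross_val_score(model, activations, ratings, cv=5, scoring="r2")
print(f"mean cross-validated R^2: {scores.mean():.3f}")
```

A model whose activations carry the same shape and texture features that drive human ratings should score well under such a scheme; the paper's information-theoretic redundancy analysis then asks which stimulus features the model predictions and the human ratings share.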
ISSN:2666-3899
DOI:10.1016/j.patter.2021.100348