Layerwise complexity-matched learning yields an improved model of cortical area V2

Human ability to recognize complex visual patterns arises through transformations performed by successive areas in the ventral visual cortex. Deep neural networks trained end-to-end for object recognition approach human capabilities, and offer the best descriptions to date of neural responses in the...

Bibliographic Details
Published in:	arXiv.org 2024-07
Main Authors:	Parthasarathy, Nikhil; Hénaff, Olivier J; Simoncelli, Eero P
Format:	Article
Language:	English
Subjects:	Alignment; Artificial neural networks; Back propagation networks; Deformation; Human performance; Machine learning; Object recognition; Task complexity

container_title	arXiv.org
creator	Parthasarathy, Nikhil; Hénaff, Olivier J; Simoncelli, Eero P
description	Human ability to recognize complex visual patterns arises through transformations performed by successive areas in the ventral visual cortex. Deep neural networks trained end-to-end for object recognition approach human capabilities, and offer the best descriptions to date of neural responses in the late stages of the hierarchy. But these networks provide a poor account of the early stages, compared to traditional hand-engineered models, or models optimized for coding efficiency or prediction. Moreover, the gradient backpropagation used in end-to-end learning is generally considered to be biologically implausible. Here, we overcome both of these limitations by developing a bottom-up self-supervised training methodology that operates independently on successive layers. Specifically, we maximize feature similarity between pairs of locally-deformed natural image patches, while decorrelating features across patches sampled from other images. Crucially, the deformation amplitudes are adjusted proportionally to receptive field sizes in each layer, thus matching the task complexity to the capacity at each stage of processing. In comparison with architecture-matched versions of previous models, we demonstrate that our layerwise complexity-matched learning (LCL) formulation produces a two-stage model (LCL-V2) that is better aligned with selectivity properties and neural activity in primate area V2. We demonstrate that the complexity-matched learning paradigm is responsible for much of the emergence of the improved biological alignment. Finally, when the two-stage model is used as a fixed front-end for a deep network trained to perform object recognition, the resultant model (LCL-V2Net) is significantly better than standard end-to-end self-supervised, supervised, and adversarially-trained models in terms of generalization to out-of-distribution tasks and alignment with human behavior.
format	article
date	2024-07-18
publisher	Ithaca: Cornell University Library, arXiv.org
rights	2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa	free_for_read
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-07
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2903732788
source	Publicly Available Content Database
subjects	Alignment; Artificial neural networks; Back propagation networks; Deformation; Human performance; Machine learning; Object recognition; Task complexity
title	Layerwise complexity-matched learning yields an improved model of cortical area V2