
On the Realization of Compositionality in Neural Networks

We present a detailed comparison of two types of sequence-to-sequence models trained to perform a compositional task. The models are architecturally identical at inference time but differ in how they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has been shown to be an effective method for encouraging more compositional solutions (Hupkes et al., 2019). We first confirm that models with attentive guidance indeed infer more compositional solutions than the baseline by training them on the lookup table task presented by Liška et al. (2019). We then conduct an in-depth analysis of the structural differences between the two model types, focusing in particular on the organisation of the parameter space and the hidden-layer activations, and find noticeable differences in both aspects. Guided networks focus more on the components of the input than on the sequence as a whole, and they develop small functional groups of neurons with specific purposes that use their gates more selectively. Results from parameter heat maps, component swapping and graph analysis further indicate that guided networks exhibit a more modular structure, with a small number of specialized, strongly connected neurons.
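As context for the task mentioned in the abstract: the lookup table domain of Liška et al. (2019) composes atomic tables that are bijections over 3-bit strings. The sketch below gives one plausible reading of that setup; the prompt format, the number of tables, and the left-to-right application order are illustrative assumptions, not the paper's exact specification.

    # Minimal sketch of a lookup-table composition task (assumed format).
    import itertools
    import random

    BITSTRINGS = ["".join(bits) for bits in itertools.product("01", repeat=3)]

    def make_table(rng: random.Random) -> dict:
        """One atomic lookup table: a random bijection on {0,1}^3."""
        outputs = BITSTRINGS[:]
        rng.shuffle(outputs)
        return dict(zip(BITSTRINGS, outputs))

    rng = random.Random(0)
    tables = {f"t{i}": make_table(rng) for i in range(1, 9)}

    def apply_composition(prompt: str) -> str:
        """Evaluate e.g. 't1 t2 001' by applying t1, then t2, to the input."""
        *names, bits = prompt.split()
        for name in names:
            bits = tables[name][bits]
        return bits

    print(apply_composition("t1 t2 001"))  # e.g. '111', depending on the seed

Attentive Guidance, as described in the abstract, adds supervision on the attention weights next to the ordinary task loss. Below is a minimal sketch of such a combined loss, assuming PyTorch and one gold-aligned input position per decoder step; the function name, tensor shapes, and weighting term are hypothetical, not the authors' implementation.

    # Hedged sketch of a task loss plus an attention-supervision term.
    import torch
    import torch.nn.functional as F

    def guided_loss(logits, targets, attn_weights, gold_alignment, weight=1.0):
        """logits: (batch, steps, vocab); targets: (batch, steps);
        attn_weights: (batch, steps, src_len), rows summing to 1;
        gold_alignment: (batch, steps) index of the input token each
        decoder step should attend to (assumed available as supervision)."""
        task_loss = F.cross_entropy(logits.flatten(0, 1), targets.flatten())
        # Negative log-likelihood of the gold-aligned input position under
        # the model's attention distribution.
        guidance = F.nll_loss(
            attn_weights.clamp_min(1e-8).log().flatten(0, 1),
            gold_alignment.flatten(),
        )
        return task_loss + weight * guidance

Supervising the attention in this way pushes each decoder step toward a specific input component, which is consistent with the abstract's finding that guided networks attend to components of the input rather than the sequence as a whole.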

Bibliographic Details
Published in: arXiv.org, 2019-06
Main Authors: Baan, Joris; Leible, Jana; Nikolaus, Mitja; Rau, David; Ulmer, Dennis; Baumgärtner, Tim; Hupkes, Dieuwke; Bruni, Elia
Format: Article
Language: English
Subjects: Functional groups; Lookup tables; Mathematical models; Modular structures; Neural networks; Neurons; Parameters
Identifier: EISSN 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org