
On the Realization of Compositionality in Neural Networks

We present a detailed comparison of two types of sequence-to-sequence models trained to perform a compositional task. The models are architecturally identical at inference time but differ in how they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has been shown to be an effective method for encouraging more compositional solutions (Hupkes et al., 2019). We first confirm that models with attentive guidance indeed infer more compositional solutions than the baseline by training them on the lookup table task presented by Liška et al. (2019). We then conduct an in-depth analysis of the structural differences between the two model types, focusing in particular on the organisation of the parameter space and the hidden-layer activations, and find noticeable differences in both aspects. Guided networks focus more on the components of the input than on the sequence as a whole, and they develop small functional groups of neurons with specific purposes that use their gates more selectively. Results from parameter heat maps, component swapping and graph analysis further indicate that guided networks exhibit a more modular structure, with a small number of specialized, strongly connected neurons.
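As context for the task mentioned in the abstract: the lookup table domain of Liška et al. (2019) composes atomic tables that are bijections over 3-bit strings. The sketch below gives one plausible reading of that setup; the prompt format, the number of tables, and the left-to-right application order are illustrative assumptions, not the paper's exact specification.

    # Minimal sketch of a lookup-table composition task (assumed format).
    import itertools
    import random

    BITSTRINGS = ["".join(bits) for bits in itertools.product("01", repeat=3)]

    def make_table(rng: random.Random) -> dict:
        """One atomic lookup table: a random bijection on {0,1}^3."""
        outputs = BITSTRINGS[:]
        rng.shuffle(outputs)
        return dict(zip(BITSTRINGS, outputs))

    rng = random.Random(0)
    tables = {f"t{i}": make_table(rng) for i in range(1, 9)}

    def apply_composition(prompt: str) -> str:
        """Evaluate e.g. 't1 t2 001' by applying t1, then t2, to the input."""
        *names, bits = prompt.split()
        for name in names:
            bits = tables[name][bits]
        return bits

    print(apply_composition("t1 t2 001"))  # e.g. '111', depending on the seed

Attentive Guidance, as described in the abstract, adds supervision on the attention weights next to the ordinary task loss. Below is a minimal sketch of such a combined loss, assuming PyTorch and one gold-aligned input position per decoder step; the function name, tensor shapes, and weighting term are hypothetical, not the authors' implementation.

    # Hedged sketch of a task loss plus an attention-supervision term.
    import torch
    import torch.nn.functional as F

    def guided_loss(logits, targets, attn_weights, gold_alignment, weight=1.0):
        """logits: (batch, steps, vocab); targets: (batch, steps);
        attn_weights: (batch, steps, src_len), rows summing to 1;
        gold_alignment: (batch, steps) index of the input token each
        decoder step should attend to (assumed available as supervision)."""
        task_loss = F.cross_entropy(logits.flatten(0, 1), targets.flatten())
        # Negative log-likelihood of the gold-aligned input position under
        # the model's attention distribution.
        guidance = F.nll_loss(
            attn_weights.clamp_min(1e-8).log().flatten(0, 1),
            gold_alignment.flatten(),
        )
        return task_loss + weight * guidance

Supervising the attention in this way pushes each decoder step toward a specific input component, which is consistent with the abstract's finding that guided networks attend to components of the input rather than the sequence as a whole.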

Bibliographic Details
Published in: arXiv.org, 2019-06
Main Authors: Baan, Joris; Leible, Jana; Nikolaus, Mitja; Rau, David; Ulmer, Dennis; Baumgärtner, Tim; Hupkes, Dieuwke; Bruni, Elia
Format: Article
Language: English
Subjects: Functional groups; Lookup tables; Mathematical models; Modular structures; Neural networks; Neurons; Parameters
Identifier: EISSN 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org