On the Realization of Compositionality in Neural Networks
We present a detailed comparison of two types of sequence-to-sequence models trained to conduct a compositional task. The models are architecturally identical at inference time, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has been shown to be an effective method for encouraging more compositional solutions (Hupkes et al., 2019). We first confirm that the models with attentive guidance indeed infer more compositional solutions than the baseline, by training them on the lookup table task presented by Liška et al. (2019). We then conduct an in-depth analysis of the structural differences between the two model types, focusing in particular on the organisation of the parameter space and the hidden-layer activations, and find noticeable differences in both these aspects. Guided networks focus more on the components of the input rather than the sequence as a whole, and develop small functional groups of neurons with specific purposes that use their gates more selectively. Results from parameter heat maps, component swapping, and graph analysis also indicate that guided networks exhibit a more modular structure, with a small number of specialized, strongly connected neurons.
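The abstract describes Attentive Guidance as extra supervision on the attention mechanism, added on top of the usual task-success signal. The record does not include the paper's exact formulation, so the following is a minimal PyTorch-style sketch under that assumption: the guided model's loss adds a cross-entropy term between the decoder's attention distributions and target ("gold") attention patterns. All names and shapes here (`attentive_guidance_loss`, `guided_training_loss`, `lam`) are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def attentive_guidance_loss(attn_weights, target_attn, eps=1e-8):
    # attn_weights: (batch, tgt_len, src_len) attention distributions
    #               produced by the decoder at each output step.
    # target_attn:  (batch, tgt_len, src_len) target patterns, e.g.
    #               one-hot rows marking which input component each
    #               output step should attend to.
    # Cross-entropy between target and predicted attention rows.
    return -(target_attn * torch.log(attn_weights + eps)).sum(dim=-1).mean()

def guided_training_loss(logits, targets, attn_weights, target_attn, lam=1.0):
    # Baseline signal: token-level cross-entropy on the output sequence.
    # logits: (batch, tgt_len, vocab); targets: (batch, tgt_len).
    task_loss = F.cross_entropy(logits.transpose(1, 2), targets)
    # Guided model: same task loss plus the attention-supervision term.
    return task_loss + lam * attentive_guidance_loss(attn_weights, target_attn)
```

Note that the auxiliary term only affects training: at inference time the attention targets are not needed, which is consistent with the abstract's statement that the two models are architecturally identical at inference time.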
Published in: | arXiv.org, 2019-06 |
---|---|
Main Authors: | Baan, Joris; Leible, Jana; Mitja Nikolaus; Rau, David; Ulmer, Dennis; Baumgärtner, Tim; Hupkes, Dieuwke; Bruni, Elia |
Format: | Article |
Language: | English |
Identifier: | EISSN: 2331-8422 |
Source: | Publicly Available Content Database |
Subjects: | Functional groups; Lookup tables; Mathematical models; Modular structures; Neural networks; Neurons; Parameters |
Online Access: | Get full text |