
A visual embedding for the unsupervised extraction of abstract semantics

Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large...

Bibliographic Details
Published in: Cognitive systems research 2017-05, Vol.42, p.73-81
Main Authors: Garcia-Gasulla, D., Ayguadé, E., Labarta, J., Béjar, J., Cortés, U., Suzumura, T., Chen, R.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93
cites cdi_FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93
container_end_page 81
container_issue
container_start_page 73
container_title Cognitive systems research
container_volume 42
creator Garcia-Gasulla, D.
Ayguadé, E.
Labarta, J.
Béjar, J.
Cortés, U.
Suzumura, T.
Chen, R.
description Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet. We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g., 118 dog types). More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g., living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results on this regard, we discuss the various problem we encountered, and how we consider to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.
doi_str_mv 10.1016/j.cogsys.2016.11.008
format article
fullrecord <record><control><sourceid>csuc_cross</sourceid><recordid>TN_cdi_csuc_recercat_oai_recercat_cat_2072_272414</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1389041716300444</els_id><sourcerecordid>oai_recercat_cat_2072_272414</sourcerecordid><originalsourceid>FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93</originalsourceid><addsrcrecordid>eNp9UMtOwzAQtBBIlMIfcPAPJHidxEkuSFUFFKkSFzhbfqyLqzap7KSif49LK8GJwz5GuzOrHULugeXAQDysc9Ov4iHmPKEcIGesuSATKJo2YyXUl3_6a3IT45qlxbbiE7KY0b2Po9pQ3Gq01ncr6vpAh0-kYxfHHYY0R0vxawjKDL7vaO-o0vEH0ohb1Q3exFty5dQm4t25TsnH89P7fJEt315e57NlZoq2HDJgzrYO0GCrmrrR4GoBgjPBrW7qShS2YqwUla605kY3wnIUwrW6cSmbtpgSOOmaOBoZklAwapC98r_gGJzVXPKal1AmTnnmhD7GgE7ugt-qcJDA5NFBuZYnB-XRQQkgk4OJ9niiYfpn7zHIaDx2Bq1PpwZpe_-_wDeU73zL</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>A visual embedding for the unsupervised extraction of abstract semantics</title><source>ScienceDirect Freedom Collection 2022-2024</source><creator>Garcia-Gasulla, D. ; Ayguadé, E. ; Labarta, J. ; Béjar, J. ; Cortés, U. ; Suzumura, T. ; Chen, R.</creator><creatorcontrib>Garcia-Gasulla, D. ; Ayguadé, E. ; Labarta, J. ; Béjar, J. ; Cortés, U. ; Suzumura, T. ; Chen, R.</creatorcontrib><description>Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet. 
We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g., 118 dog types). More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g., living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results on this regard, we discuss the various problem we encountered, and how we consider to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.</description><identifier>ISSN: 1389-0417</identifier><identifier>EISSN: 1389-0417</identifier><identifier>DOI: 10.1016/j.cogsys.2016.11.008</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Aprenentatge cognitiu ; Artificial image cognition ; Cognitive learning ; Deep learning embeddings ; Ensenyament i aprenentatge ; Metodologies docents ; Visual reasoning ; Àrees temàtiques de la UPC</subject><ispartof>Cognitive systems research, 2017-05, Vol.42, p.73-81</ispartof><rights>2016 Elsevier B.V.</rights><rights>Attribution-NonCommercial-NoDerivs 4.0 International License https://creativecommons.org/licenses/by-nc-nd/4.0/ 
info:eu-repo/semantics/openAccess</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93</citedby><cites>FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27924,27925</link.rule.ids></links><search><creatorcontrib>Garcia-Gasulla, D.</creatorcontrib><creatorcontrib>Ayguadé, E.</creatorcontrib><creatorcontrib>Labarta, J.</creatorcontrib><creatorcontrib>Béjar, J.</creatorcontrib><creatorcontrib>Cortés, U.</creatorcontrib><creatorcontrib>Suzumura, T.</creatorcontrib><creatorcontrib>Chen, R.</creatorcontrib><title>A visual embedding for the unsupervised extraction of abstract semantics</title><title>Cognitive systems research</title><description>Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet. We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g., 118 dog types). 
More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g., living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results on this regard, we discuss the various problem we encountered, and how we consider to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.</description><subject>Aprenentatge cognitiu</subject><subject>Artificial image cognition</subject><subject>Cognitive learning</subject><subject>Deep learning embeddings</subject><subject>Ensenyament i aprenentatge</subject><subject>Metodologies docents</subject><subject>Visual reasoning</subject><subject>Àrees temàtiques de la UPC</subject><issn>1389-0417</issn><issn>1389-0417</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNp9UMtOwzAQtBBIlMIfcPAPJHidxEkuSFUFFKkSFzhbfqyLqzap7KSif49LK8GJwz5GuzOrHULugeXAQDysc9Ov4iHmPKEcIGesuSATKJo2YyXUl3_6a3IT45qlxbbiE7KY0b2Po9pQ3Gq01ncr6vpAh0-kYxfHHYY0R0vxawjKDL7vaO-o0vEH0ohb1Q3exFty5dQm4t25TsnH89P7fJEt315e57NlZoq2HDJgzrYO0GCrmrrR4GoBgjPBrW7qShS2YqwUla605kY3wnIUwrW6cSmbtpgSOOmaOBoZklAwapC98r_gGJzVXPKal1AmTnnmhD7GgE7ugt-qcJDA5NFBuZYnB-XRQQkgk4OJ9niiYfpn7zHIaDx2Bq1PpwZpe_-_wDeU73zL</recordid><startdate>201705</startdate><enddate>201705</enddate><creator>Garcia-Gasulla, D.</creator><creator>Ayguadé, E.</creator><creator>Labarta, J.</creator><creator>Béjar, J.</creator><creator>Cortés, U.</creator><creator>Suzumura, T.</creator><creator>Chen, R.</creator><general>Elsevier B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>XX2</scope></search><sort><creationdate>201705</creationdate><title>A visual embedding for the unsupervised extraction of abstract semantics</title><author>Garcia-Gasulla, D. ; Ayguadé, E. ; Labarta, J. ; Béjar, J. ; Cortés, U. ; Suzumura, T. 
; Chen, R.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Aprenentatge cognitiu</topic><topic>Artificial image cognition</topic><topic>Cognitive learning</topic><topic>Deep learning embeddings</topic><topic>Ensenyament i aprenentatge</topic><topic>Metodologies docents</topic><topic>Visual reasoning</topic><topic>Àrees temàtiques de la UPC</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Garcia-Gasulla, D.</creatorcontrib><creatorcontrib>Ayguadé, E.</creatorcontrib><creatorcontrib>Labarta, J.</creatorcontrib><creatorcontrib>Béjar, J.</creatorcontrib><creatorcontrib>Cortés, U.</creatorcontrib><creatorcontrib>Suzumura, T.</creatorcontrib><creatorcontrib>Chen, R.</creatorcontrib><collection>CrossRef</collection><collection>Recercat</collection><jtitle>Cognitive systems research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Garcia-Gasulla, D.</au><au>Ayguadé, E.</au><au>Labarta, J.</au><au>Béjar, J.</au><au>Cortés, U.</au><au>Suzumura, T.</au><au>Chen, R.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A visual embedding for the unsupervised extraction of abstract semantics</atitle><jtitle>Cognitive systems research</jtitle><date>2017-05</date><risdate>2017</risdate><volume>42</volume><spage>73</spage><epage>81</epage><pages>73-81</pages><issn>1389-0417</issn><eissn>1389-0417</eissn><abstract>Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. 
For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet. We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g., 118 dog types). More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g., living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results on this regard, we discuss the various problem we encountered, and how we consider to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.cogsys.2016.11.008</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1389-0417
ispartof Cognitive systems research, 2017-05, Vol.42, p.73-81
issn 1389-0417
1389-0417
language eng
recordid cdi_csuc_recercat_oai_recercat_cat_2072_272414
source ScienceDirect Freedom Collection 2022-2024
subjects Aprenentatge cognitiu
Artificial image cognition
Cognitive learning
Deep learning embeddings
Ensenyament i aprenentatge
Metodologies docents
Visual reasoning
Àrees temàtiques de la UPC
title A visual embedding for the unsupervised extraction of abstract semantics
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T08%3A35%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20visual%20embedding%20for%20the%20unsupervised%20extraction%20of%20abstract%20semantics&rft.jtitle=Cognitive%20systems%20research&rft.au=Garcia-Gasulla,%20D.&rft.date=2017-05&rft.volume=42&rft.spage=73&rft.epage=81&rft.pages=73-81&rft.issn=1389-0417&rft.eissn=1389-0417&rft_id=info:doi/10.1016/j.cogsys.2016.11.008&rft_dat=%3Ccsuc_cross%3Eoai_recercat_cat_2072_272414%3C/csuc_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c394t-10fd9f1ece9a878b1f76162062db87563d500465b5bb2cb86d2e66f9b8f6f9c93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true