Automatic image annotation: the quirks and what works

Automatic image annotation is one of the fundamental problems in computer vision and machine learning. Given an image, the goal is to predict a set of textual labels that describe its semantics. Over the last decade, a large number of image annotation techniques have been proposed and shown to achieve encouraging results on various annotation datasets. However, their scope has mostly remained restricted to quantitative results on the test data, ignoring key aspects related to dataset properties and evaluation metrics that inherently affect performance to a considerable extent. In this paper, we first evaluate ten state-of-the-art approaches (both deep-learning based and non-deep-learning based) for image annotation using the same baseline CNN features. We then propose new quantitative measures to examine various issues in the image annotation domain, such as dataset-specific biases, per-label versus per-image evaluation criteria, and the impact of changing the number and type of predicted labels. We believe the conclusions derived in this paper through thorough empirical analyses will be helpful in making systematic advancements in this domain.
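The per-label versus per-image evaluation criteria contrasted in the abstract are easy to make concrete. The sketch below is illustrative only, not code from the paper: it computes precision and recall under both averaging schemes for multi-label predictions, assuming binary indicator matrices of shape (n_images, n_labels); the function names and toy data are hypothetical.

```python
import numpy as np

def per_label_prf(y_true, y_pred):
    """Per-label (macro) precision/recall: each label is scored over all
    images, then scores are averaged across labels, so rare labels count
    as much as frequent ones."""
    tp = np.logical_and(y_true, y_pred).sum(axis=0).astype(float)
    pred_pos = y_pred.sum(axis=0)   # predictions per label
    true_pos = y_true.sum(axis=0)   # ground-truth occurrences per label
    precision = np.divide(tp, pred_pos, out=np.zeros_like(tp), where=pred_pos > 0)
    recall = np.divide(tp, true_pos, out=np.zeros_like(tp), where=true_pos > 0)
    return precision.mean(), recall.mean()

def per_image_prf(y_true, y_pred):
    """Per-image precision/recall: each image's predicted label set is
    scored against its ground truth, then scores are averaged across
    images, so label-rich images do not dominate."""
    tp = np.logical_and(y_true, y_pred).sum(axis=1).astype(float)
    pred_pos = y_pred.sum(axis=1)   # predictions per image
    true_pos = y_true.sum(axis=1)   # ground-truth labels per image
    precision = np.divide(tp, pred_pos, out=np.zeros_like(tp), where=pred_pos > 0)
    recall = np.divide(tp, true_pos, out=np.zeros_like(tp), where=true_pos > 0)
    return precision.mean(), recall.mean()

# Toy example: 3 images, 4 labels; each image is assigned a fixed number
# of top-scoring labels, as is common in annotation benchmarks.
y_true = np.array([[1, 1, 0, 0],
                   [0, 1, 1, 0],
                   [0, 0, 0, 1]])
y_pred = np.array([[1, 0, 1, 0],
                   [0, 1, 1, 0],
                   [1, 0, 0, 1]])
print(per_label_prf(y_true, y_pred))  # averaged over labels
print(per_image_prf(y_true, y_pred))  # averaged over images
```

Per-label averaging rewards getting rare labels right, while per-image averaging reflects how useful the predicted tag set is for a typical image; the two criteria can therefore rank the same set of methods differently.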

Bibliographic Details
Published in: Multimedia Tools and Applications, 2018-12, Vol. 77 (24), p. 31991-32011
Main Authors: Dutta, Ayushi; Verma, Yashaswi; Jawahar, C. V.
Format: Article
Language: English
Publisher: New York: Springer US
ISSN: 1380-7501
EISSN: 1573-7721
DOI: 10.1007/s11042-018-6247-3
Subjects: Annotations; Computer Communication Networks; Computer Science; Computer vision; Data Structures and Information Theory; Deep learning; Empirical analysis; Image annotation; Labels; Machine learning; Multimedia Information Systems; Semantics; Special Purpose and Application-Based Systems