Automatic image annotation: the quirks and what works
Automatic image annotation is one of the fundamental problems in computer vision and machine learning. Given an image, the goal is to predict a set of textual labels that describe the semantics of that image. Over the last decade, a large number of image annotation techniques have been proposed and shown to achieve encouraging results on various annotation datasets. However, their scope has mostly remained restricted to quantitative results on the test data, ignoring key aspects of dataset properties and evaluation metrics that inherently affect performance to a considerable extent. In this paper, we first evaluate ten state-of-the-art approaches to image annotation (both deep-learning-based and non-deep-learning-based) using the same baseline CNN features. We then propose new quantitative measures to examine various issues in the image annotation domain, such as dataset-specific biases, per-label versus per-image evaluation criteria, and the impact of changing the number and type of predicted labels. We believe the conclusions derived in this paper through thorough empirical analyses will be helpful in making systematic advancements in this domain.
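The per-label versus per-image contrast mentioned in the abstract can be made concrete with a short sketch. The following is not code from the paper; the function names and toy data are illustrative assumptions. Per-image metrics average precision and recall over test images, so frequent labels dominate the score, while per-label metrics average over the label vocabulary, so rare labels weigh as much as common ones.

```python
# Minimal sketch of per-image vs. per-label evaluation for multi-label
# image annotation. Not taken from the paper; names and toy data are
# illustrative. gold/pred map image ids to sets of labels (annotation
# systems conventionally assign a fixed number of labels, e.g. top-5).
from collections import defaultdict

def per_image_prf(gold, pred):
    # Precision/recall computed per image, then averaged over images.
    p_sum = r_sum = 0.0
    for img, g in gold.items():
        p = pred.get(img, set())
        tp = len(g & p)
        p_sum += tp / len(p) if p else 0.0
        r_sum += tp / len(g) if g else 0.0
    n = len(gold)
    P, R = p_sum / n, r_sum / n
    return P, R, (2 * P * R / (P + R)) if P + R else 0.0

def per_label_prf(gold, pred):
    # Precision/recall computed per label, then averaged over the
    # vocabulary, so a rare label counts as much as a frequent one.
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for img, g in gold.items():
        p = pred.get(img, set())
        for lbl in p & g: tp[lbl] += 1
        for lbl in p - g: fp[lbl] += 1
        for lbl in g - p: fn[lbl] += 1
    labels = set(tp) | set(fp) | set(fn)
    precs = [tp[l] / (tp[l] + fp[l]) if tp[l] + fp[l] else 0.0 for l in labels]
    recs = [tp[l] / (tp[l] + fn[l]) if tp[l] + fn[l] else 0.0 for l in labels]
    P, R = sum(precs) / len(labels), sum(recs) / len(labels)
    return P, R, (2 * P * R / (P + R)) if P + R else 0.0

# Toy data: a predictor that always emits the frequent label "sky" looks
# decent per-image but weak per-label, illustrating dataset-specific bias.
gold = {1: {"sky", "plane"}, 2: {"sky", "sea"}, 3: {"car"}}
pred = {1: {"sky"}, 2: {"sky"}, 3: {"sky"}}
print(per_image_prf(gold, pred))  # P=0.67, R=0.33: frequent label dominates
print(per_label_prf(gold, pred))  # P=0.17, R=0.25: missed rare labels drag it down
```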
Published in: | Multimedia tools and applications, 2018-12, Vol. 77 (24), p. 31991-32011 |
---|---|
Main Authors: | Dutta, Ayushi; Verma, Yashaswi; Jawahar, C. V. |
Format: | Article |
Language: | English |
Subjects: | Annotations; Computer Communication Networks; Computer Science; Computer vision; Data Structures and Information Theory; Deep learning; Empirical analysis; Image annotation; Labels; Machine learning; Multimedia Information Systems; Semantics; Special Purpose and Application-Based Systems |
DOI: | 10.1007/s11042-018-6247-3 |
ISSN: | 1380-7501 |
EISSN: | 1573-7721 |
Publisher: | New York: Springer US |
Source: | ABI/INFORM global; Springer Nature |