SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning

Recent advances in deep learning and computer vision have set a new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and, more recently, as an open-set retrieval problem. Current approaches suffer from distinguish...

Bibliographic Details
Main Authors:	Li, Chenge; Fehervari, Istvan; Zhao, Xiaonan; Macedo, Ives; Appalaraju, Srikar
Format:	Conference Proceeding
Language:	English
Subjects:	Computer vision; Feature extraction; Image/Video Indexing and Retrieval; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un-supervised Learning; Vision and Languages; Measurement; Predictive models; Text recognition; Training; Visualization

cited_by
cites
container_end_page	596
container_issue
container_start_page	587
container_title
container_volume
creator	Li, Chenge; Fehervari, Istvan; Zhao, Xiaonan; Macedo, Ives; Appalaraju, Srikar
description	Recent advances in deep learning and computer vision have set a new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and, more recently, as an open-set retrieval problem. Current approaches struggle to distinguish visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address this problem, we propose a multi-task learning architecture that combines deep metric learning and scene text recognition. We use brand names as weak labels and force the model to simultaneously extract distinct visual features and predict the brand name text. To this end, we collected a dataset, named PL8K, of 3 million logos cropped from Amazon Product Catalog images across nearly 8K brands. Our experiments show that adding the text recognition task during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.
doi_str_mv	10.1109/WACV51458.2022.00066
format	conference_proceeding
fullrecord	<record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_9706752</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9706752</ieee_id><sourcerecordid>9706752</sourcerecordid><originalsourceid>FETCH-LOGICAL-i203t-4e21f556a223c86bb6d0d07de014cfbe2de5edaac37cbda0e9cd34bd289961bb3</originalsourceid><addsrcrecordid>eNotzLtOwzAUAFCDhERb-AIY_AMu13bsxGxRxEsKqtSGMlZ-3ARDSSonUunfM8B0tkPILYcl52Du3stqq3imiqUAIZYAoPUZmXOtVQaGKzgnM6EzwYws-CWZj-MngDTcyBlZbxAb_LqnW0wnWtvUIdt4u0e6OmDPRpxoPXQDXaMfuj5OcejpMU4ftMGfiZVHm5C-4pSipzXa1Me-uyIXrd2PeP3vgrw9PjTVM6tXTy9VWbMoQE4sQ8FbpbQVQvpCO6cDBMgDAs9861AEVBis9TL3LlhA44PMXBCFMZo7Jxfk5u-NiLg7pPht02lnctC5EvIXk-9QUg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</title><source>IEEE Xplore All Conference Series</source><creator>Li, Chenge ; Fehervari, Istvan ; Zhao, Xiaonan ; Macedo, Ives ; Appalaraju, Srikar</creator><creatorcontrib>Li, Chenge ; Fehervari, Istvan ; Zhao, Xiaonan ; Macedo, Ives ; Appalaraju, Srikar</creatorcontrib><description>Recent advances in deep learning and computer vision have set new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and more recently as an open-set retrieval problem. Current approaches suffer from distinguishing visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address the problem, we propose a multi-task learning architecture of deep metric learning and scene text recognition. We use brand names as weak labels and enforce the model to simultaneously extract distinct visual features as well as predict brand name text. 
To achieve it, we collected a dataset with 3 Million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the task of text recognition during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.</description><identifier>EISSN: 2642-9381</identifier><identifier>EISBN: 1665409150</identifier><identifier>EISBN: 9781665409155</identifier><identifier>DOI: 10.1109/WACV51458.2022.00066</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computer vision ; Feature extraction ; Image/Video Indexing and Retrieval ; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un- supervised Learning; Vision and Languages ; Measurement ; Predictive models ; Text recognition ; Training ; Visualization</subject><ispartof>2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, p.587-596</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9706752$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,23930,23931,25140,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9706752$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Li, Chenge</creatorcontrib><creatorcontrib>Fehervari, Istvan</creatorcontrib><creatorcontrib>Zhao, Xiaonan</creatorcontrib><creatorcontrib>Macedo, Ives</creatorcontrib><creatorcontrib>Appalaraju, Srikar</creatorcontrib><title>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</title><title>2022 IEEE/CVF Winter Conference on Applications of Computer Vision 
(WACV)</title><addtitle>WACV</addtitle><description>Recent advances in deep learning and computer vision have set new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and more recently as an open-set retrieval problem. Current approaches suffer from distinguishing visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address the problem, we propose a multi-task learning architecture of deep metric learning and scene text recognition. We use brand names as weak labels and enforce the model to simultaneously extract distinct visual features as well as predict brand name text. To achieve it, we collected a dataset with 3 Million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the task of text recognition during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.</description><subject>Computer vision</subject><subject>Feature extraction</subject><subject>Image/Video Indexing and Retrieval ; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un- supervised Learning; Vision and Languages</subject><subject>Measurement</subject><subject>Predictive models</subject><subject>Text 
recognition</subject><subject>Training</subject><subject>Visualization</subject><issn>2642-9381</issn><isbn>1665409150</isbn><isbn>9781665409155</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2022</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotzLtOwzAUAFCDhERb-AIY_AMu13bsxGxRxEsKqtSGMlZ-3ARDSSonUunfM8B0tkPILYcl52Du3stqq3imiqUAIZYAoPUZmXOtVQaGKzgnM6EzwYws-CWZj-MngDTcyBlZbxAb_LqnW0wnWtvUIdt4u0e6OmDPRpxoPXQDXaMfuj5OcejpMU4ftMGfiZVHm5C-4pSipzXa1Me-uyIXrd2PeP3vgrw9PjTVM6tXTy9VWbMoQE4sQ8FbpbQVQvpCO6cDBMgDAs9861AEVBis9TL3LlhA44PMXBCFMZo7Jxfk5u-NiLg7pPht02lnctC5EvIXk-9QUg</recordid><startdate>202201</startdate><enddate>202201</enddate><creator>Li, Chenge</creator><creator>Fehervari, Istvan</creator><creator>Zhao, Xiaonan</creator><creator>Macedo, Ives</creator><creator>Appalaraju, Srikar</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>202201</creationdate><title>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</title><author>Li, Chenge ; Fehervari, Istvan ; Zhao, Xiaonan ; Macedo, Ives ; Appalaraju, Srikar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i203t-4e21f556a223c86bb6d0d07de014cfbe2de5edaac37cbda0e9cd34bd289961bb3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer vision</topic><topic>Feature extraction</topic><topic>Image/Video Indexing and Retrieval ; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un- supervised Learning; Vision and Languages</topic><topic>Measurement</topic><topic>Predictive models</topic><topic>Text 
recognition</topic><topic>Training</topic><topic>Visualization</topic><toplevel>online_resources</toplevel><creatorcontrib>Li, Chenge</creatorcontrib><creatorcontrib>Fehervari, Istvan</creatorcontrib><creatorcontrib>Zhao, Xiaonan</creatorcontrib><creatorcontrib>Macedo, Ives</creatorcontrib><creatorcontrib>Appalaraju, Srikar</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEL</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Chenge</au><au>Fehervari, Istvan</au><au>Zhao, Xiaonan</au><au>Macedo, Ives</au><au>Appalaraju, Srikar</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</atitle><btitle>2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)</btitle><stitle>WACV</stitle><date>2022-01</date><risdate>2022</risdate><spage>587</spage><epage>596</epage><pages>587-596</pages><eissn>2642-9381</eissn><eisbn>1665409150</eisbn><eisbn>9781665409155</eisbn><coden>IEEPAD</coden><abstract>Recent advances in deep learning and computer vision have set new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and more recently as an open-set retrieval problem. Current approaches suffer from distinguishing visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address the problem, we propose a multi-task learning architecture of deep metric learning and scene text recognition. 
We use brand names as weak labels and enforce the model to simultaneously extract distinct visual features as well as predict brand name text. To achieve it, we collected a dataset with 3 Million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the task of text recognition during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.</abstract><pub>IEEE</pub><doi>10.1109/WACV51458.2022.00066</doi><tpages>10</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	EISSN: 2642-9381
ispartof	2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, p.587-596
issn	2642-9381
language	eng
recordid	cdi_ieee_primary_9706752
source	IEEE Xplore All Conference Series
subjects	Computer vision; Feature extraction; Image/Video Indexing and Retrieval; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un-supervised Learning; Vision and Languages; Measurement; Predictive models; Text recognition; Training; Visualization
title	SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T09%3A53%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=SeeTek:%20Very%20Large-Scale%20Open-set%20Logo%20Recognition%20with%20Text-Aware%20Metric%20Learning&rft.btitle=2022%20IEEE/CVF%20Winter%20Conference%20on%20Applications%20of%20Computer%20Vision%20(WACV)&rft.au=Li,%20Chenge&rft.date=2022-01&rft.spage=587&rft.epage=596&rft.pages=587-596&rft.eissn=2642-9381&rft.coden=IEEPAD&rft_id=info:doi/10.1109/WACV51458.2022.00066&rft.eisbn=1665409150&rft.eisbn_list=9781665409155&rft_dat=%3Cieee_CHZPO%3E9706752%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i203t-4e21f556a223c86bb6d0d07de014cfbe2de5edaac37cbda0e9cd34bd289961bb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9706752&rfr_iscdi=true