SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning

Bibliographic Details
Main Authors: Li, Chenge; Fehervari, Istvan; Zhao, Xiaonan; Macedo, Ives; Appalaraju, Srikar
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 596
container_issue
container_start_page 587
container_title
container_volume
creator Li, Chenge
Fehervari, Istvan
Zhao, Xiaonan
Macedo, Ives
Appalaraju, Srikar
description Recent advances in deep learning and computer vision have set a new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and, more recently, as an open-set retrieval problem. Current approaches struggle to distinguish visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address this problem, we propose a multi-task learning architecture that combines deep metric learning and scene text recognition. We use brand names as weak labels and train the model to simultaneously extract distinctive visual features and predict the brand name text. To this end, we collected PL8K, a dataset of 3 million logos cropped from Amazon Product Catalog images across nearly 8K brands. Our experiments show that adding the text recognition task during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.
doi_str_mv 10.1109/WACV51458.2022.00066
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_9706752</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9706752</ieee_id><sourcerecordid>9706752</sourcerecordid><originalsourceid>FETCH-LOGICAL-i203t-4e21f556a223c86bb6d0d07de014cfbe2de5edaac37cbda0e9cd34bd289961bb3</originalsourceid><addsrcrecordid>eNotzLtOwzAUAFCDhERb-AIY_AMu13bsxGxRxEsKqtSGMlZ-3ARDSSonUunfM8B0tkPILYcl52Du3stqq3imiqUAIZYAoPUZmXOtVQaGKzgnM6EzwYws-CWZj-MngDTcyBlZbxAb_LqnW0wnWtvUIdt4u0e6OmDPRpxoPXQDXaMfuj5OcejpMU4ftMGfiZVHm5C-4pSipzXa1Me-uyIXrd2PeP3vgrw9PjTVM6tXTy9VWbMoQE4sQ8FbpbQVQvpCO6cDBMgDAs9861AEVBis9TL3LlhA44PMXBCFMZo7Jxfk5u-NiLg7pPht02lnctC5EvIXk-9QUg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</title><source>IEEE Xplore All Conference Series</source><creator>Li, Chenge ; Fehervari, Istvan ; Zhao, Xiaonan ; Macedo, Ives ; Appalaraju, Srikar</creator><creatorcontrib>Li, Chenge ; Fehervari, Istvan ; Zhao, Xiaonan ; Macedo, Ives ; Appalaraju, Srikar</creatorcontrib><description>Recent advances in deep learning and computer vision have set new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and more recently as an open-set retrieval problem. Current approaches suffer from distinguishing visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address the problem, we propose a multi-task learning architecture of deep metric learning and scene text recognition. We use brand names as weak labels and enforce the model to simultaneously extract distinct visual features as well as predict brand name text. 
To achieve it, we collected a dataset with 3 Million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the task of text recognition during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.</description><identifier>EISSN: 2642-9381</identifier><identifier>EISBN: 1665409150</identifier><identifier>EISBN: 9781665409155</identifier><identifier>DOI: 10.1109/WACV51458.2022.00066</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Computer vision ; Feature extraction ; Image/Video Indexing and Retrieval ; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un- supervised Learning; Vision and Languages ; Measurement ; Predictive models ; Text recognition ; Training ; Visualization</subject><ispartof>2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, p.587-596</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9706752$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,23930,23931,25140,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9706752$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Li, Chenge</creatorcontrib><creatorcontrib>Fehervari, Istvan</creatorcontrib><creatorcontrib>Zhao, Xiaonan</creatorcontrib><creatorcontrib>Macedo, Ives</creatorcontrib><creatorcontrib>Appalaraju, Srikar</creatorcontrib><title>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</title><title>2022 IEEE/CVF Winter Conference on Applications of Computer Vision 
(WACV)</title><addtitle>WACV</addtitle><description>Recent advances in deep learning and computer vision have set new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and more recently as an open-set retrieval problem. Current approaches suffer from distinguishing visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address the problem, we propose a multi-task learning architecture of deep metric learning and scene text recognition. We use brand names as weak labels and enforce the model to simultaneously extract distinct visual features as well as predict brand name text. To achieve it, we collected a dataset with 3 Million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the task of text recognition during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.</description><subject>Computer vision</subject><subject>Feature extraction</subject><subject>Image/Video Indexing and Retrieval ; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un- supervised Learning; Vision and Languages</subject><subject>Measurement</subject><subject>Predictive models</subject><subject>Text 
recognition</subject><subject>Training</subject><subject>Visualization</subject><issn>2642-9381</issn><isbn>1665409150</isbn><isbn>9781665409155</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2022</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotzLtOwzAUAFCDhERb-AIY_AMu13bsxGxRxEsKqtSGMlZ-3ARDSSonUunfM8B0tkPILYcl52Du3stqq3imiqUAIZYAoPUZmXOtVQaGKzgnM6EzwYws-CWZj-MngDTcyBlZbxAb_LqnW0wnWtvUIdt4u0e6OmDPRpxoPXQDXaMfuj5OcejpMU4ftMGfiZVHm5C-4pSipzXa1Me-uyIXrd2PeP3vgrw9PjTVM6tXTy9VWbMoQE4sQ8FbpbQVQvpCO6cDBMgDAs9861AEVBis9TL3LlhA44PMXBCFMZo7Jxfk5u-NiLg7pPht02lnctC5EvIXk-9QUg</recordid><startdate>202201</startdate><enddate>202201</enddate><creator>Li, Chenge</creator><creator>Fehervari, Istvan</creator><creator>Zhao, Xiaonan</creator><creator>Macedo, Ives</creator><creator>Appalaraju, Srikar</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>202201</creationdate><title>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</title><author>Li, Chenge ; Fehervari, Istvan ; Zhao, Xiaonan ; Macedo, Ives ; Appalaraju, Srikar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i203t-4e21f556a223c86bb6d0d07de014cfbe2de5edaac37cbda0e9cd34bd289961bb3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer vision</topic><topic>Feature extraction</topic><topic>Image/Video Indexing and Retrieval ; Large-scale Vision Applications; Object Detection/Recognition/Categorization; Transfer; Few-shot; Semi- and Un- supervised Learning; Vision and Languages</topic><topic>Measurement</topic><topic>Predictive models</topic><topic>Text 
recognition</topic><topic>Training</topic><topic>Visualization</topic><toplevel>online_resources</toplevel><creatorcontrib>Li, Chenge</creatorcontrib><creatorcontrib>Fehervari, Istvan</creatorcontrib><creatorcontrib>Zhao, Xiaonan</creatorcontrib><creatorcontrib>Macedo, Ives</creatorcontrib><creatorcontrib>Appalaraju, Srikar</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEL</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Chenge</au><au>Fehervari, Istvan</au><au>Zhao, Xiaonan</au><au>Macedo, Ives</au><au>Appalaraju, Srikar</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning</atitle><btitle>2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)</btitle><stitle>WACV</stitle><date>2022-01</date><risdate>2022</risdate><spage>587</spage><epage>596</epage><pages>587-596</pages><eissn>2642-9381</eissn><eisbn>1665409150</eisbn><eisbn>9781665409155</eisbn><coden>IEEPAD</coden><abstract>Recent advances in deep learning and computer vision have set new state of the art in logo recognition [2], [9], [36]. Logo recognition has mostly been approached as a closed-set object recognition problem and more recently as an open-set retrieval problem. Current approaches suffer from distinguishing visually similar logos, especially in open-set retrieval for very large-scale applications with thousands of brands. To address the problem, we propose a multi-task learning architecture of deep metric learning and scene text recognition. 
We use brand names as weak labels and enforce the model to simultaneously extract distinct visual features as well as predict brand name text. To achieve it, we collected a dataset with 3 Million logos cropped from Amazon Product Catalog images across nearly 8K brands, named PL8K. Our experiments show that adding the task of text recognition during training boosts the model's retrieval performance both on our PL8K dataset and on five other public logo datasets.</abstract><pub>IEEE</pub><doi>10.1109/WACV51458.2022.00066</doi><tpages>10</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2642-9381
ispartof 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, p.587-596
issn 2642-9381
language eng
recordid cdi_ieee_primary_9706752
source IEEE Xplore All Conference Series
subjects Computer vision
Feature extraction
Image/Video Indexing and Retrieval
Large-scale Vision Applications
Object Detection/Recognition/Categorization
Transfer
Few-shot
Semi- and Un- supervised Learning
Vision and Languages
Measurement
Predictive models
Text recognition
Training
Visualization
title SeeTek: Very Large-Scale Open-set Logo Recognition with Text-Aware Metric Learning
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T09%3A53%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=SeeTek:%20Very%20Large-Scale%20Open-set%20Logo%20Recognition%20with%20Text-Aware%20Metric%20Learning&rft.btitle=2022%20IEEE/CVF%20Winter%20Conference%20on%20Applications%20of%20Computer%20Vision%20(WACV)&rft.au=Li,%20Chenge&rft.date=2022-01&rft.spage=587&rft.epage=596&rft.pages=587-596&rft.eissn=2642-9381&rft.coden=IEEPAD&rft_id=info:doi/10.1109/WACV51458.2022.00066&rft.eisbn=1665409150&rft.eisbn_list=9781665409155&rft_dat=%3Cieee_CHZPO%3E9706752%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i203t-4e21f556a223c86bb6d0d07de014cfbe2de5edaac37cbda0e9cd34bd289961bb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9706752&rfr_iscdi=true