Natural scene text localization and detection using MSER and its variants: a comprehensive survey
Text localization and detection within natural scene images have generated significant interest among researchers due to their inherent complexity and various real-life applications. In the last few decades, various methodologies have been developed for localization and detection of wild scene text...
Published in: | Multimedia tools and applications 2024-05, Vol.83 (18), p.55773-55810 |
---|---|
Main Authors: | Dutta, Kalpita ; Sarkhel, Ritesh ; Kundu, Mahantapas ; Nasipuri, Mita ; Das, Nibaran |
Format: | Article |
Language: | English |
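Subjects: | Computer Communication Networks ; Computer Science ; Computer vision ; Data Structures and Information Theory ; Deep learning ; Image analysis ; Localization ; Machine learning ; Multimedia Information Systems ; Special Purpose and Application-Based Systems ; Track 6: Computer Vision for Multimedia Applications |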
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c270t-28287af7906431c2986ed9b41ba23a7d4c249f7e67fd3111831a87a12cd231e43 |
container_end_page | 55810 |
container_issue | 18 |
container_start_page | 55773 |
container_title | Multimedia tools and applications |
container_volume | 83 |
creator | Dutta, Kalpita ; Sarkhel, Ritesh ; Kundu, Mahantapas ; Nasipuri, Mita ; Das, Nibaran |
description | Text localization and detection in natural scene images have generated significant interest among researchers because of their inherent complexity and their many real-life applications. Over the last few decades, various methodologies have been developed for localizing and detecting text regions in wild scenes. Among them, techniques based on Maximally Stable Extremal Regions (MSER) have achieved remarkable success in a wide variety of text localization tasks over the last decade. MSER is a well-known blob detection method that has been applied, with some modifications, in many scene-text studies. In this paper, we review and evaluate MSER methods combined either with traditional machine learning methods that use hand-crafted features or with deep learning methods that learn features automatically for scene text localization. The MSER methods described in this study include standard MSER, MSER with stroke width transform, eMSER, enhanced MSER, multi-level MSER, MSER with CNN features, component splitting with MSER tree, MSER with CNN and CRF, and CE-MSER. Finally, we compare and evaluate the performance of these MSER methods on five publicly available standard scene text datasets (ICDAR 2003, ICDAR 2013, ICDAR 2015, KAIST, and SVT) and provide insights into selecting an appropriate MSER method, along with the pros and cons of each. |
doi_str_mv | 10.1007/s11042-023-17671-1 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3055255019</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3055255019</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-28287af7906431c2986ed9b41ba23a7d4c249f7e67fd3111831a87a12cd231e43</originalsourceid><addsrcrecordid>eNp9kE1PwzAMhiMEEmPwBzhF4lyIk7RpuaFpfEgDJD7OUZa6o1OXjiSdGL-esiLBiZNt-X1s6SHkFNg5MKYuAgCTPGFcJKAyBQnskRGkSiRKcdj_0x-SoxCWjEGWcjki5sHEzpuGBosOacSPSJvWmqb-NLFuHTWupCVGtLupC7Vb0Pvn6dNuUcdAN8bXxsVwSQ217Wrt8Q1dqDdIQ-c3uD0mB5VpAp781DF5vZ6-TG6T2ePN3eRqlliuWEx4znNlKlWwTAqwvMgzLIu5hLnhwqhSWi6LSmGmqlIAQC7A9ABwW3IBKMWYnA1317597zBEvWw77_qXWrA05WnKoOhTfEhZ34bgsdJrX6-M32pg-lulHlTqXqXeqdTQQ2KAQh92C_S_p_-hvgCq_3Zp</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3055255019</pqid></control><display><type>article</type><title>Natural scene text localization and detection using MSER and its variants: a comprehensive survey</title><source>Springer Link</source><creator>Dutta, Kalpita ; Sarkhel, Ritesh ; Kundu, Mahantapas ; Nasipuri, Mita ; Das, Nibaran</creator><creatorcontrib>Dutta, Kalpita ; Sarkhel, Ritesh ; Kundu, Mahantapas ; Nasipuri, Mita ; Das, Nibaran</creatorcontrib><description>Text localization and detection within natural scene images have generated significant interest among researchers due to their inherent complexity and various real-life applications. In the last few decades, various methodologies have been developed for localization and detection of wild scene text regions. Among them, Maximally Stable Extremal Regions (MSER) based techniques have achieved remarkable success in a significant variety of text localization tasks over the last decade. MSER is a well-known blob detection method, which has been applied with some modifications in many scene text-related researches. 
In this paper, we have reviewed and evaluated the concept of MSER methods which are combined with traditional machine learning-based methods using hand-crafted features or deep learning-based methods using automatic feature learning for scene text localization. Different MSER methods, such as standard MSER, MSER with stroke width transform, eMSER, enhanced MSER, multi-level MSER, MSER with CNN features, component splitting with MSER tree, MSER with CNN and CRF, CE-MSER have been described in this study. Finally, we have compared and evaluated the performances of those different types of MSER methods on five publicly available standard scene text datasets, like ICDAR 2003, ICDAR 2013, ICDAR 2015, KAIST, and SVT and provided the insights of appropriate selection of MSER method along with its pros and cons.</description><identifier>ISSN: 1573-7721</identifier><identifier>ISSN: 1380-7501</identifier><identifier>EISSN: 1573-7721</identifier><identifier>DOI: 10.1007/s11042-023-17671-1</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Computer Communication Networks ; Computer Science ; Computer vision ; Data Structures and Information Theory ; Deep learning ; Image analysis ; Localization ; Machine learning ; Multimedia Information Systems ; Special Purpose and Application-Based Systems ; Track 6: Computer Vision for Multimedia Applications</subject><ispartof>Multimedia tools and applications, 2024-05, Vol.83 (18), p.55773-55810</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-28287af7906431c2986ed9b41ba23a7d4c249f7e67fd3111831a87a12cd231e43</cites><orcidid>0000-0002-2426-9915</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail></links><search><creatorcontrib>Dutta, Kalpita</creatorcontrib><creatorcontrib>Sarkhel, Ritesh</creatorcontrib><creatorcontrib>Kundu, Mahantapas</creatorcontrib><creatorcontrib>Nasipuri, Mita</creatorcontrib><creatorcontrib>Das, Nibaran</creatorcontrib><title>Natural scene text localization and detection using MSER and its variants: a comprehensive survey</title><title>Multimedia tools and applications</title><addtitle>Multimed Tools Appl</addtitle><description>Text localization and detection within natural scene images have generated significant interest among researchers due to their inherent complexity and various real-life applications. In the last few decades, various methodologies have been developed for localization and detection of wild scene text regions. Among them, Maximally Stable Extremal Regions (MSER) based techniques have achieved remarkable success in a significant variety of text localization tasks over the last decade. MSER is a well-known blob detection method, which has been applied with some modifications in many scene text-related researches. 
In this paper, we have reviewed and evaluated the concept of MSER methods which are combined with traditional machine learning-based methods using hand-crafted features or deep learning-based methods using automatic feature learning for scene text localization. Different MSER methods, such as standard MSER, MSER with stroke width transform, eMSER, enhanced MSER, multi-level MSER, MSER with CNN features, component splitting with MSER tree, MSER with CNN and CRF, CE-MSER have been described in this study. Finally, we have compared and evaluated the performances of those different types of MSER methods on five publicly available standard scene text datasets, like ICDAR 2003, ICDAR 2013, ICDAR 2015, KAIST, and SVT and provided the insights of appropriate selection of MSER method along with its pros and cons.</description><subject>Computer Communication Networks</subject><subject>Computer Science</subject><subject>Computer vision</subject><subject>Data Structures and Information Theory</subject><subject>Deep learning</subject><subject>Image analysis</subject><subject>Localization</subject><subject>Machine learning</subject><subject>Multimedia Information Systems</subject><subject>Special Purpose and Application-Based Systems</subject><subject>Track 6: Computer Vision for Multimedia Applications</subject><issn>1573-7721</issn><issn>1380-7501</issn><issn>1573-7721</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kE1PwzAMhiMEEmPwBzhF4lyIk7RpuaFpfEgDJD7OUZa6o1OXjiSdGL-esiLBiZNt-X1s6SHkFNg5MKYuAgCTPGFcJKAyBQnskRGkSiRKcdj_0x-SoxCWjEGWcjki5sHEzpuGBosOacSPSJvWmqb-NLFuHTWupCVGtLupC7Vb0Pvn6dNuUcdAN8bXxsVwSQ217Wrt8Q1dqDdIQ-c3uD0mB5VpAp781DF5vZ6-TG6T2ePN3eRqlliuWEx4znNlKlWwTAqwvMgzLIu5hLnhwqhSWi6LSmGmqlIAQC7A9ABwW3IBKMWYnA1317597zBEvWw77_qXWrA05WnKoOhTfEhZ34bgsdJrX6-M32pg-lulHlTqXqXeqdTQQ2KAQh92C_S_p_-hvgCq_3Zp</recordid><startdate>202405</startdate><enddate>202405</enddate><creator>Dutta, 
Kalpita</creator><creator>Sarkhel, Ritesh</creator><creator>Kundu, Mahantapas</creator><creator>Nasipuri, Mita</creator><creator>Das, Nibaran</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2426-9915</orcidid></search><sort><creationdate>202405</creationdate><title>Natural scene text localization and detection using MSER and its variants: a comprehensive survey</title><author>Dutta, Kalpita ; Sarkhel, Ritesh ; Kundu, Mahantapas ; Nasipuri, Mita ; Das, Nibaran</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-28287af7906431c2986ed9b41ba23a7d4c249f7e67fd3111831a87a12cd231e43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Communication Networks</topic><topic>Computer Science</topic><topic>Computer vision</topic><topic>Data Structures and Information Theory</topic><topic>Deep learning</topic><topic>Image analysis</topic><topic>Localization</topic><topic>Machine learning</topic><topic>Multimedia Information Systems</topic><topic>Special Purpose and Application-Based Systems</topic><topic>Track 6: Computer Vision for Multimedia Applications</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Dutta, Kalpita</creatorcontrib><creatorcontrib>Sarkhel, Ritesh</creatorcontrib><creatorcontrib>Kundu, Mahantapas</creatorcontrib><creatorcontrib>Nasipuri, Mita</creatorcontrib><creatorcontrib>Das, Nibaran</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with 
Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Multimedia tools and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Dutta, Kalpita</au><au>Sarkhel, Ritesh</au><au>Kundu, Mahantapas</au><au>Nasipuri, Mita</au><au>Das, Nibaran</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Natural scene text localization and detection using MSER and its variants: a comprehensive survey</atitle><jtitle>Multimedia tools and applications</jtitle><stitle>Multimed Tools Appl</stitle><date>2024-05</date><risdate>2024</risdate><volume>83</volume><issue>18</issue><spage>55773</spage><epage>55810</epage><pages>55773-55810</pages><issn>1573-7721</issn><issn>1380-7501</issn><eissn>1573-7721</eissn><abstract>Text localization and detection within natural scene images have generated significant interest among researchers due to their inherent complexity and various real-life applications. In the last few decades, various methodologies have been developed for localization and detection of wild scene text regions. Among them, Maximally Stable Extremal Regions (MSER) based techniques have achieved remarkable success in a significant variety of text localization tasks over the last decade. MSER is a well-known blob detection method, which has been applied with some modifications in many scene text-related researches. In this paper, we have reviewed and evaluated the concept of MSER methods which are combined with traditional machine learning-based methods using hand-crafted features or deep learning-based methods using automatic feature learning for scene text localization. 
Different MSER methods, such as standard MSER, MSER with stroke width transform, eMSER, enhanced MSER, multi-level MSER, MSER with CNN features, component splitting with MSER tree, MSER with CNN and CRF, CE-MSER have been described in this study. Finally, we have compared and evaluated the performances of those different types of MSER methods on five publicly available standard scene text datasets, like ICDAR 2003, ICDAR 2013, ICDAR 2015, KAIST, and SVT and provided the insights of appropriate selection of MSER method along with its pros and cons.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11042-023-17671-1</doi><tpages>38</tpages><orcidid>https://orcid.org/0000-0002-2426-9915</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1573-7721 |
ispartof | Multimedia tools and applications, 2024-05, Vol.83 (18), p.55773-55810 |
issn | 1573-7721 1380-7501 1573-7721 |
language | eng |
recordid | cdi_proquest_journals_3055255019 |
source | Springer Link |
subjects | Computer Communication Networks ; Computer Science ; Computer vision ; Data Structures and Information Theory ; Deep learning ; Image analysis ; Localization ; Machine learning ; Multimedia Information Systems ; Special Purpose and Application-Based Systems ; Track 6: Computer Vision for Multimedia Applications |
title | Natural scene text localization and detection using MSER and its variants: a comprehensive survey |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-03-06T03%3A43%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Natural%20scene%20text%20localization%20and%20detection%20using%20MSER%20and%20its%20variants:%20a%20comprehensive%20survey&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Dutta,%20Kalpita&rft.date=2024-05&rft.volume=83&rft.issue=18&rft.spage=55773&rft.epage=55810&rft.pages=55773-55810&rft.issn=1573-7721&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-023-17671-1&rft_dat=%3Cproquest_cross%3E3055255019%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c270t-28287af7906431c2986ed9b41ba23a7d4c249f7e67fd3111831a87a12cd231e43%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3055255019&rft_id=info:pmid/&rfr_iscdi=true |