LO2net: Global–Local Semantics Coupled Network for scene-specific video foreground extraction with less supervision

Bibliographic Details
Published in: Pattern analysis and applications : PAA, 2023, Vol. 26 (4), p. 1671-1683
Main Authors: Ruan, Tao, Wei, Shikui, Zhao, Yao, Guo, Baoqing, Yu, Zujun
Format: Article
Language: English
description Video foreground extraction has been widely applied in quantitative fields and has attracted great attention worldwide. Nevertheless, the performance of such a method can easily degrade in cluttered environments. To tackle this problem, global semantics (e.g., background statistics) and local semantics (e.g., boundary areas) can be utilized to better distinguish foreground objects from a complex background. In this paper, we investigate how to effectively leverage these two kinds of semantics. For global semantics, two convolutional modules are designed to take advantage of data-level background priors and feature-level multi-scale characteristics, respectively; for local semantics, another module is further put forward to be aware of the semantic edges between foreground and background. The three modules are intertwined with each other, yielding a simple yet effective deep framework named the gLObal–LOcal Semantics Coupled Network (LO2Net), which is end-to-end trainable in a scene-specific manner. Benefiting from the LO2Net, we achieve superior performance on multiple public datasets with less supervision, compared against several state-of-the-art methods.
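To make the abstract's three cues concrete, here is a minimal, hypothetical NumPy sketch of the ideas it names: a data-level background prior (temporal statistics), a multi-scale response, and an edge-aware local cue coupled with the global ones. This is an illustration only, not the paper's LO2Net, which is a trained convolutional network; all function names and the simple thresholding scheme below are assumptions for the sketch.

```python
import numpy as np

def temporal_median_background(frames):
    """Global cue: per-pixel temporal median as a data-level background prior."""
    return np.median(frames, axis=0)

def multiscale_difference(frame, background, scales=(1, 2, 4)):
    """Global cue: |frame - background| averaged over several scales.

    Assumes the frame height and width are divisible by each scale.
    """
    diff = np.abs(frame - background)
    h, w = diff.shape
    maps = []
    for s in scales:
        # Downsample by block-averaging, then nearest-neighbour upsample back.
        small = diff.reshape(h // s, s, w // s, s).mean(axis=(1, 3))
        maps.append(np.kron(small, np.ones((s, s))))
    return np.mean(maps, axis=0)

def edge_magnitude(mask):
    """Local cue: gradient magnitude of a coarse mask, highlighting boundary areas."""
    gy, gx = np.gradient(mask.astype(float))
    return np.hypot(gx, gy)

def extract_foreground(frames, frame, threshold=0.5):
    """Couple the global and local cues into a binary foreground mask."""
    bg = temporal_median_background(frames)
    coarse = multiscale_difference(frame, bg)
    edges = edge_magnitude(coarse > threshold)
    # Boost the response near predicted boundaries before the final threshold.
    return (coarse * (1.0 + edges)) > threshold
```

In the actual paper these roles are played by learned convolutional modules trained end to end per scene; the sketch only shows why combining a background prior, multi-scale responses, and boundary awareness is more robust than any single cue.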
DOI: 10.1007/s10044-023-01193-5
ISSN: 1433-7541
EISSN: 1433-755X
Source: Springer Link
Subjects: Computer Science; Modules; Pattern Recognition; Semantics; Theoretical Advances