Learning Representative Deep Features for Image Set Analysis

This paper proposes to learn features from sets of labeled raw images. With this method, over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small amount of training data, i.e., 420 labeled albums with about 30 000 photos. This method can e...

Bibliographic Details
Published in:	IEEE transactions on multimedia 2015-11, Vol.17 (11), p.1960-1968
Main Authors:	Wu, Zifeng, Huang, Yongzhen, Wang, Liang
Format:	Article
Language:	English
Subjects:	Album classification Classification Convolution Data models deep learning Feature extraction Fittings gait recognition Hidden Markov models Human performance image set Learning Multimedia Temporal logic Training Training data Videos

cited_by	cdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003
cites	cdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003
container_end_page	1968
container_issue	11
container_start_page	1960
container_title	IEEE transactions on multimedia
container_volume	17
creator	Wu, Zifeng Huang, Yongzhen Wang, Liang
description	This paper proposes to learn features from sets of labeled raw images. With this method, over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small amount of training data, i.e., 420 labeled albums with about 30 000 photos. The method can effectively deal with sets of images, whether or not the sets bear temporal structure. A typical approach to sequential image analysis leverages motion between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms the previous best performers in album classification, and achieves comparable or even better performance in gait-based human identification. These results demonstrate its effectiveness and good adaptability to different kinds of set data.
doi_str_mv	10.1109/TMM.2015.2477681
format	article
fullrecord	<record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_miscellaneous_1778007760</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7254176</ieee_id><sourcerecordid>3856168831</sourcerecordid><originalsourceid>FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</originalsourceid><addsrcrecordid>eNpdkNFLwzAQh4MoOKfvgi8FX3zpzCVpsoIvYzoddAg6n0OaXUZH19akFfbf27Lhg093HN_vuPsIuQU6AaDp43q1mjAKyYQJpeQUzsgIUgExpUqd933CaJwyoJfkKoQdpSASqkbkKUPjq6LaRh_YeAxYtaYtfjB6RmyiBZq266eRq3203JstRp_YRrPKlIdQhGty4UwZ8OZUx-Rr8bKev8XZ--tyPstiyxW08UaJPOeI1pkkdQmXSgIytrHCWSYYWgSZ5k5RZCC4EFY6wXPDzQaYE5TyMXk47m18_d1haPW-CBbL0lRYd0GDUtPhTzmg9__QXd35_t6BYilPhRSip-iRsr4OwaPTjS_2xh80UD3o1L1OPejUJ5195O4YKRDxD1csEaAk_wX2jG9i</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1729394644</pqid></control><display><type>article</type><title>Learning Representative Deep Features for Image Set Analysis</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang</creator><creatorcontrib>Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang</creatorcontrib><description>This paper proposes to learn features from sets of labeled raw images. With this method, the problem of over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. 
Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. These results demonstrate its effectiveness and good adaptivity to different kinds of set data.</description><identifier>ISSN: 1520-9210</identifier><identifier>EISSN: 1941-0077</identifier><identifier>DOI: 10.1109/TMM.2015.2477681</identifier><identifier>CODEN: ITMUF8</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Album classification ; Classification ; Convolution ; Data models ; deep learning ; Feature extraction ; Fittings ; gait recognition ; Hidden Markov models ; Human performance ; image set ; Learning ; Multimedia ; Temporal logic ; Training ; Training data ; Videos</subject><ispartof>IEEE transactions on multimedia, 2015-11, Vol.17 (11), p.1960-1968</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Nov 2015</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</citedby><cites>FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7254176$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Wu, Zifeng</creatorcontrib><creatorcontrib>Huang, Yongzhen</creatorcontrib><creatorcontrib>Wang, Liang</creatorcontrib><title>Learning Representative Deep Features for Image Set Analysis</title><title>IEEE transactions on multimedia</title><addtitle>TMM</addtitle><description>This paper proposes to learn features from sets of labeled raw images. 
With this method, the problem of over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. These results demonstrate its effectiveness and good adaptivity to different kinds of set data.</description><subject>Album classification</subject><subject>Classification</subject><subject>Convolution</subject><subject>Data models</subject><subject>deep learning</subject><subject>Feature extraction</subject><subject>Fittings</subject><subject>gait recognition</subject><subject>Hidden Markov models</subject><subject>Human performance</subject><subject>image set</subject><subject>Learning</subject><subject>Multimedia</subject><subject>Temporal logic</subject><subject>Training</subject><subject>Training data</subject><subject>Videos</subject><issn>1520-9210</issn><issn>1941-0077</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><recordid>eNpdkNFLwzAQh4MoOKfvgi8FX3zpzCVpsoIvYzoddAg6n0OaXUZH19akFfbf27Lhg093HN_vuPsIuQU6AaDp43q1mjAKyYQJpeQUzsgIUgExpUqd933CaJwyoJfkKoQdpSASqkbkKUPjq6LaRh_YeAxYtaYtfjB6RmyiBZq266eRq3203JstRp_YRrPKlIdQhGty4UwZ8OZUx-Rr8bKev8XZ--tyPstiyxW08UaJPOeI1pkkdQmXSgIytrHCWSYYWgSZ5k5RZCC4EFY6wXPDzQaYE5TyMXk47m18_d1haPW-CBbL0lRYd0GDUtPhTzmg9__QXd35_t6BYilPhRSip-iRsr4OwaPTjS_2xh80UD3o1L1OPejUJ5195O4YKRDxD1csEaAk_wX2jG9i</recordid><startdate>201511</startdate><enddate>201511</enddate><creator>Wu, 
Zifeng</creator><creator>Huang, Yongzhen</creator><creator>Wang, Liang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>201511</creationdate><title>Learning Representative Deep Features for Image Set Analysis</title><author>Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Album classification</topic><topic>Classification</topic><topic>Convolution</topic><topic>Data models</topic><topic>deep learning</topic><topic>Feature extraction</topic><topic>Fittings</topic><topic>gait recognition</topic><topic>Hidden Markov models</topic><topic>Human performance</topic><topic>image set</topic><topic>Learning</topic><topic>Multimedia</topic><topic>Temporal logic</topic><topic>Training</topic><topic>Training data</topic><topic>Videos</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wu, Zifeng</creatorcontrib><creatorcontrib>Huang, Yongzhen</creatorcontrib><creatorcontrib>Wang, Liang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on multimedia</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wu, Zifeng</au><au>Huang, Yongzhen</au><au>Wang, Liang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Learning Representative Deep Features for Image Set Analysis</atitle><jtitle>IEEE transactions on multimedia</jtitle><stitle>TMM</stitle><date>2015-11</date><risdate>2015</risdate><volume>17</volume><issue>11</issue><spage>1960</spage><epage>1968</epage><pages>1960-1968</pages><issn>1520-9210</issn><eissn>1941-0077</eissn><coden>ITMUF8</coden><abstract>This paper proposes to learn features from sets of labeled raw images. With this method, the problem of over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. 
These results demonstrate its effectiveness and good adaptivity to different kinds of set data.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TMM.2015.2477681</doi><tpages>9</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1520-9210
ispartof	IEEE transactions on multimedia, 2015-11, Vol.17 (11), p.1960-1968
issn	1520-9210 1941-0077
language	eng
recordid	cdi_proquest_miscellaneous_1778007760
source	IEEE Electronic Library (IEL) Journals
subjects	Album classification Classification Convolution Data models deep learning Feature extraction Fittings gait recognition Hidden Markov models Human performance image set Learning Multimedia Temporal logic Training Training data Videos
title	Learning Representative Deep Features for Image Set Analysis
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T20%3A34%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Representative%20Deep%20Features%20for%20Image%20Set%20Analysis&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Wu,%20Zifeng&rft.date=2015-11&rft.volume=17&rft.issue=11&rft.spage=1960&rft.epage=1968&rft.pages=1960-1968&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2015.2477681&rft_dat=%3Cproquest_ieee_%3E3856168831%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1729394644&rft_id=info:pmid/&rft_ieee_id=7254176&rfr_iscdi=true