Learning Representative Deep Features for Image Set Analysis

This paper proposes to learn features from sets of labeled raw images. With this method, over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small amount of training data, i.e., 420 labeled albums with about 30 000 photos. This method can e...

Bibliographic Details
Published in: IEEE transactions on multimedia 2015-11, Vol.17 (11), p.1960-1968
Main Authors: Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003
cites cdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003
container_end_page 1968
container_issue 11
container_start_page 1960
container_title IEEE transactions on multimedia
container_volume 17
creator Wu, Zifeng
Huang, Yongzhen
Wang, Liang
description This paper proposes to learn features from sets of labeled raw images. With this method, over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small amount of training data, i.e., 420 labeled albums with about 30 000 photos. The method handles sets of images whether or not the sets have temporal structure. A typical approach to sequential image analysis leverages motion between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms the previous best performers in album classification, and achieves comparable or even better performance in gait-based human identification. These results demonstrate its effectiveness and good adaptability to different kinds of set data.
doi_str_mv 10.1109/TMM.2015.2477681
format article
fullrecord <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_miscellaneous_1778007760</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7254176</ieee_id><sourcerecordid>3856168831</sourcerecordid><originalsourceid>FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</originalsourceid><addsrcrecordid>eNpdkNFLwzAQh4MoOKfvgi8FX3zpzCVpsoIvYzoddAg6n0OaXUZH19akFfbf27Lhg093HN_vuPsIuQU6AaDp43q1mjAKyYQJpeQUzsgIUgExpUqd933CaJwyoJfkKoQdpSASqkbkKUPjq6LaRh_YeAxYtaYtfjB6RmyiBZq266eRq3203JstRp_YRrPKlIdQhGty4UwZ8OZUx-Rr8bKev8XZ--tyPstiyxW08UaJPOeI1pkkdQmXSgIytrHCWSYYWgSZ5k5RZCC4EFY6wXPDzQaYE5TyMXk47m18_d1haPW-CBbL0lRYd0GDUtPhTzmg9__QXd35_t6BYilPhRSip-iRsr4OwaPTjS_2xh80UD3o1L1OPejUJ5195O4YKRDxD1csEaAk_wX2jG9i</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1729394644</pqid></control><display><type>article</type><title>Learning Representative Deep Features for Image Set Analysis</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang</creator><creatorcontrib>Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang</creatorcontrib><description>This paper proposes to learn features from sets of labeled raw images. With this method, the problem of over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. 
Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. These results demonstrate its effectiveness and good adaptivity to different kinds of set data.</description><identifier>ISSN: 1520-9210</identifier><identifier>EISSN: 1941-0077</identifier><identifier>DOI: 10.1109/TMM.2015.2477681</identifier><identifier>CODEN: ITMUF8</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Album classification ; Classification ; Convolution ; Data models ; deep learning ; Feature extraction ; Fittings ; gait recognition ; Hidden Markov models ; Human performance ; image set ; Learning ; Multimedia ; Temporal logic ; Training ; Training data ; Videos</subject><ispartof>IEEE transactions on multimedia, 2015-11, Vol.17 (11), p.1960-1968</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Nov 2015</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</citedby><cites>FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7254176$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Wu, Zifeng</creatorcontrib><creatorcontrib>Huang, Yongzhen</creatorcontrib><creatorcontrib>Wang, Liang</creatorcontrib><title>Learning Representative Deep Features for Image Set Analysis</title><title>IEEE transactions on multimedia</title><addtitle>TMM</addtitle><description>This paper proposes to learn features from sets of labeled raw images. 
With this method, the problem of over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. These results demonstrate its effectiveness and good adaptivity to different kinds of set data.</description><subject>Album classification</subject><subject>Classification</subject><subject>Convolution</subject><subject>Data models</subject><subject>deep learning</subject><subject>Feature extraction</subject><subject>Fittings</subject><subject>gait recognition</subject><subject>Hidden Markov models</subject><subject>Human performance</subject><subject>image set</subject><subject>Learning</subject><subject>Multimedia</subject><subject>Temporal logic</subject><subject>Training</subject><subject>Training data</subject><subject>Videos</subject><issn>1520-9210</issn><issn>1941-0077</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><recordid>eNpdkNFLwzAQh4MoOKfvgi8FX3zpzCVpsoIvYzoddAg6n0OaXUZH19akFfbf27Lhg093HN_vuPsIuQU6AaDp43q1mjAKyYQJpeQUzsgIUgExpUqd933CaJwyoJfkKoQdpSASqkbkKUPjq6LaRh_YeAxYtaYtfjB6RmyiBZq266eRq3203JstRp_YRrPKlIdQhGty4UwZ8OZUx-Rr8bKev8XZ--tyPstiyxW08UaJPOeI1pkkdQmXSgIytrHCWSYYWgSZ5k5RZCC4EFY6wXPDzQaYE5TyMXk47m18_d1haPW-CBbL0lRYd0GDUtPhTzmg9__QXd35_t6BYilPhRSip-iRsr4OwaPTjS_2xh80UD3o1L1OPejUJ5195O4YKRDxD1csEaAk_wX2jG9i</recordid><startdate>201511</startdate><enddate>201511</enddate><creator>Wu, 
Zifeng</creator><creator>Huang, Yongzhen</creator><creator>Wang, Liang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>201511</creationdate><title>Learning Representative Deep Features for Image Set Analysis</title><author>Wu, Zifeng ; Huang, Yongzhen ; Wang, Liang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Album classification</topic><topic>Classification</topic><topic>Convolution</topic><topic>Data models</topic><topic>deep learning</topic><topic>Feature extraction</topic><topic>Fittings</topic><topic>gait recognition</topic><topic>Hidden Markov models</topic><topic>Human performance</topic><topic>image set</topic><topic>Learning</topic><topic>Multimedia</topic><topic>Temporal logic</topic><topic>Training</topic><topic>Training data</topic><topic>Videos</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wu, Zifeng</creatorcontrib><creatorcontrib>Huang, Yongzhen</creatorcontrib><creatorcontrib>Wang, Liang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on multimedia</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wu, Zifeng</au><au>Huang, Yongzhen</au><au>Wang, Liang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Learning Representative Deep Features for Image Set Analysis</atitle><jtitle>IEEE transactions on multimedia</jtitle><stitle>TMM</stitle><date>2015-11</date><risdate>2015</risdate><volume>17</volume><issue>11</issue><spage>1960</spage><epage>1968</epage><pages>1960-1968</pages><issn>1520-9210</issn><eissn>1941-0077</eissn><coden>ITMUF8</coden><abstract>This paper proposes to learn features from sets of labeled raw images. With this method, the problem of over-fitting can be effectively suppressed, so that deep CNNs can be trained from scratch with a small number of training data, i.e., 420 labeled albums with about 30 000 photos. This method can effectively deal with sets of images, no matter if the sets bear temporal structures. A typical approach to sequential image analysis usually leverages motions between adjacent frames, while the proposed method focuses on capturing the co-occurrences and frequencies of features. Nevertheless, our method outperforms previous best performers in terms of album classification, and achieves comparable or even better performances in terms of gait based human identification. 
These results demonstrate its effectiveness and good adaptivity to different kinds of set data.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TMM.2015.2477681</doi><tpages>9</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1520-9210
ispartof IEEE transactions on multimedia, 2015-11, Vol.17 (11), p.1960-1968
issn 1520-9210
1941-0077
language eng
recordid cdi_proquest_miscellaneous_1778007760
source IEEE Electronic Library (IEL) Journals
subjects Album classification
Classification
Convolution
Data models
deep learning
Feature extraction
Fittings
gait recognition
Hidden Markov models
Human performance
image set
Learning
Multimedia
Temporal logic
Training
Training data
Videos
title Learning Representative Deep Features for Image Set Analysis
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T20%3A34%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Representative%20Deep%20Features%20for%20Image%20Set%20Analysis&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Wu,%20Zifeng&rft.date=2015-11&rft.volume=17&rft.issue=11&rft.spage=1960&rft.epage=1968&rft.pages=1960-1968&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2015.2477681&rft_dat=%3Cproquest_ieee_%3E3856168831%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c371t-d74bb3eecfa59f536761e22dc4fc242ece169bf70e214344c6f43ba3ad12f4003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1729394644&rft_id=info:pmid/&rft_ieee_id=7254176&rfr_iscdi=true