Latent semantic learning with time-series cross correlation analysis for video scene detection and classification
This paper presents a latent semantic learning method, based on a proposed time-series cross correlation analysis, that extracts a discriminative dynamic scene model for video event recognition and 3D human body gesture recognition. Typical dynamic texture analysis pos...
Published in: | Multimedia tools and applications 2016-10, Vol.75 (20), p.12919-12940 |
---|---|
Main Authors: | Cheng, Shyi-Chyi ; Su, Jui-Yuan ; Hsiao, Kuei-Fang ; Rashvand, Habib F. |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c349t-defe8e7c3e19d0b97715dfd6da2d9f71158f5bfbcb99fc01117f4f16e0f6b1123 |
---|---|
cites | cdi_FETCH-LOGICAL-c349t-defe8e7c3e19d0b97715dfd6da2d9f71158f5bfbcb99fc01117f4f16e0f6b1123 |
container_end_page | 12940 |
container_issue | 20 |
container_start_page | 12919 |
container_title | Multimedia tools and applications |
container_volume | 75 |
creator | Cheng, Shyi-Chyi Su, Jui-Yuan Hsiao, Kuei-Fang Rashvand, Habib F. |
description | This paper presents a latent semantic learning method, based on a proposed time-series cross correlation analysis, that extracts a discriminative dynamic scene model for video event recognition and 3D human body gesture recognition. Typical dynamic texture analysis addresses the modeling, learning, recognition, and synthesis of images of dynamic scenes using the autoregressive moving average (ARMA) model. Instead of applying the ARMA approach to capture the temporal structure of video sequences, the proposed algorithm uses the learned dynamic scene model to semantically transform video sequences into multiple scenes at a lower computational cost. Generating a discriminative dynamic scene model that preserves space-time information is therefore crucial to the success of the proposed latent semantic learning. To this end, k-medoids clustering with appearance distance metrics is first used to partition all frames of the training video sequences, regardless of their scene types, into an initial key-frame codebook. To discover the temporal structure of the dynamic scene model, we develop a time-series cross correlation analysis (TSCCA) for latent semantic learning, with an alternating dynamic programming (ADP) scheme that embeds the temporal relationships between the training images into the dynamic scene model. We also tackle a problem of dynamic programming, which is thought to produce large temporal misalignments for periodic activities. Moreover, the discriminative power of the model is estimated by a deterministic projection-based learning algorithm. Finally, based on the learned dynamic scene model, a support vector machine (SVM) with a two-channel string kernel is used for video scene classification. Two test datasets, one for video event classification and the other for 3D human body gesture recognition, are used to verify the effectiveness of the proposed approach. Experimental results demonstrate that the proposed algorithm obtains good performance in terms of classification accuracy. |
doi_str_mv | 10.1007/s11042-015-2548-y |
format | article |
publisher | New York: Springer US |
rights | Springer Science+Business Media New York 2015 |
tpages | 22 |
fulltext | fulltext |
identifier | ISSN: 1380-7501 |
ispartof | Multimedia tools and applications, 2016-10, Vol.75 (20), p.12919-12940 |
issn | 1380-7501 1573-7721 |
language | eng |
recordid | cdi_proquest_miscellaneous_1845797364 |
source | ABI/INFORM global; Springer Link |
subjects | Accuracy ; Algorithms ; Classification ; Computer Communication Networks ; Computer Science ; Computer vision ; Correlation analysis ; Data Structures and Information Theory ; Dynamic programming ; Dynamic structural analysis ; Dynamics ; Learning ; Machine learning ; Multimedia Information Systems ; Recognition ; Semantics ; Spacetime ; Special Purpose and Application-Based Systems ; Surveillance ; Texture |
title | Latent semantic learning with time-series cross correlation analysis for video scene detection and classification |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T21%3A37%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Latent%20semantic%20learning%20with%20time-series%20cross%20correlation%20analysis%20for%20video%20scene%20detection%20and%20classification&rft.jtitle=Multimedia%20tools%20and%20applications&rft.au=Cheng,%20Shyi-Chyi&rft.date=2016-10-01&rft.volume=75&rft.issue=20&rft.spage=12919&rft.epage=12940&rft.pages=12919-12940&rft.issn=1380-7501&rft.eissn=1573-7721&rft_id=info:doi/10.1007/s11042-015-2548-y&rft_dat=%3Cproquest_cross%3E4313398921%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c349t-defe8e7c3e19d0b97715dfd6da2d9f71158f5bfbcb99fc01117f4f16e0f6b1123%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1867931749&rft_id=info:pmid/&rfr_iscdi=true |
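
The abstract describes partitioning training frames with k-medoids clustering to obtain an initial key-frame codebook. The following is a minimal sketch of that one step only, not the authors' implementation: it assumes each frame is already reduced to a small numeric feature vector, uses Euclidean distance as a stand-in for the paper's appearance distance metrics, and picks the initial medoids deterministically (farthest-first from frame 0). The function names `k_medoids` and `quantize` and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns the sorted indices of the medoids.

    Assumes k does not exceed the number of distinct points.
    Initialization is farthest-first from point 0 (an assumption made
    here for reproducibility, not taken from the paper).
    """
    medoids = [0]
    while len(medoids) < k:
        # Pick the point farthest from its nearest current medoid.
        farthest = max(
            range(len(points)),
            key=lambda i: min(dist(points[i], points[m]) for m in medoids),
        )
        medoids.append(farthest)

    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)

        # Update step: within each cluster, the new medoid is the member
        # with the smallest total distance to the other members.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # keep a medoid that attracted no points
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(dist(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)

        if sorted(new_medoids) == sorted(medoids):
            break  # converged: medoids no longer change
        medoids = new_medoids
    return sorted(medoids)


def quantize(frames, codebook):
    """Map each frame to the index of its nearest key frame (codeword)."""
    return [
        min(range(len(codebook)), key=lambda c: dist(f, codebook[c]))
        for f in frames
    ]


# Toy 2-D "appearance features" for six training frames from two scenes.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
         (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
medoid_idx = k_medoids(train, k=2)
codebook = [train[i] for i in medoid_idx]

# A new video becomes a sequence of codeword indices.
video = [(0.2, 0.1), (9.0, 9.5), (10.5, 10.0), (1.0, 1.0)]
codes = quantize(video, codebook)
print(medoid_idx, codes)
```

Representing a video as a sequence of codeword indices is what makes string-based comparison possible later on; the temporal analysis (TSCCA, ADP) and the two-channel string kernel named in the abstract are not covered by this sketch.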