Loading…

Semisupervised Feature Selection via Spline Regression for Video Semantic Recognition

To improve both the efficiency and accuracy of video semantic recognition, we can perform feature selection on the extracted video features to select a subset of features from the high-dimensional feature set for a compact and accurate video data representation. Provided the number of labeled videos...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transaction on neural networks and learning systems 2015-02, Vol.26 (2), p.252-264
Main Authors: Yahong Han, Yi Yang, Yan Yan, Zhigang Ma, Sebe, Nicu, Xiaofang Zhou
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:To improve both the efficiency and accuracy of video semantic recognition, we can perform feature selection on the extracted video features to select a subset of features from the high-dimensional feature set for a compact and accurate video data representation. Provided the number of labeled videos is small, supervised feature selection could fail to identify the relevant features that are discriminative to target classes. In many applications, abundant unlabeled videos are easily accessible. This motivates us to develop semisupervised feature selection algorithms to better identify the relevant video features, which are discriminative to target classes by effectively exploiting the information underlying the huge amount of unlabeled video data. In this paper, we propose a framework of video semantic recognition by semisupervised feature selection via spline regression (S 2 FS 2 R). Two scatter matrices are combined to capture both the discriminative information and the local geometry structure of labeled and unlabeled training videos: A within-class scatter matrix encoding discriminative information of labeled training videos and a spline scatter output from a local spline regression encoding data distribution. An â„“ 2,1 -norm is imposed as a regularization term on the transformation matrix to ensure it is sparse in rows, making it particularly suitable for feature selection. To efficiently solve S 2 FS 2 R, we develop an iterative algorithm and prove its convergency. In the experiments, three typical tasks of video semantic recognition, such as video concept detection, video classification, and human action recognition, are used to demonstrate that the proposed S 2 FS 2 R achieves better performance compared with the state-of-the-art methods.
ISSN:2162-237X
2162-2388
DOI:10.1109/TNNLS.2014.2314123