
Trajectory aligned features for first person action recognition

Bibliographic Details
Published in: Pattern Recognition, 2017-02, Vol. 62, pp. 45-55
Main Authors: Singh, Suriya; Arora, Chetan; Jawahar, C.V.
Format: Article
Language:English
Summary: Egocentric videos are characterized by their first person view. With the popularity of Google Glass and GoPro, the use of egocentric videos is on the rise, and with the substantial increase in the number of such videos, the value of recognizing the wearer's actions in them has also increased. Unstructured camera movement due to the natural head motion of the wearer causes sharp changes in the visual field of the egocentric camera, causing many standard third person action recognition techniques to perform poorly on such videos. Objects present in the scene and hand gestures of the wearer are the most important cues for first person action recognition, but they are difficult to segment and recognize in an egocentric video. We propose a novel representation of first person actions derived from feature trajectories. The features are simple to compute using standard point tracking and, unlike many previous approaches, do not assume hand/object segmentation or recognition of object or hand pose. We train a bag of words classifier with the proposed features and report a performance improvement of more than 11% on publicly available datasets. Although not designed for this particular case, we show that our technique can also recognize the wearer's actions when hands or objects are not visible.
Highlights:
•We propose a novel and simple representation of first person actions.
•The features are derived from feature trajectories and are simple to compute using standard point tracking.
•Our approach does not assume hand or object segmentation or pose recognition.
•Our technique yields an improvement of more than 11% on publicly available datasets.
•Our method can recognize the wearer's actions when hands and objects are not visible.
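To make the pipeline described in the summary concrete, below is a minimal sketch of the generic recipe it names: track feature points with standard KLT optical flow to form short trajectory descriptors, quantize them with k-means into a per-video bag-of-words histogram, and train a classifier on the histograms. This is an illustrative approximation using OpenCV and scikit-learn, not the paper's trajectory-aligned descriptors; the parameter values (trajectory length, codebook size) and helper names (track_trajectories, encode_bow) are assumptions made for the sketch.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def track_trajectories(video_path, traj_len=15, max_corners=200):
    """Track corner points with Lucas-Kanade optical flow and return
    fixed-length displacement trajectories, one row per trajectory.
    Parameter values are illustrative, not taken from the paper."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        return np.empty((0, traj_len * 2))
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=7)
    tracks = [[p.ravel()] for p in pts] if pts is not None else []
    trajectories = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if not tracks:
            # Re-detect points when all tracks have been lost.
            pts = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                          qualityLevel=0.01, minDistance=7)
            tracks = [[p.ravel()] for p in pts] if pts is not None else []
            prev_gray = gray
            continue
        p0 = np.float32([t[-1] for t in tracks]).reshape(-1, 1, 2)
        p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
        new_tracks = []
        for t, p, st in zip(tracks, p1.reshape(-1, 2), status.ravel()):
            if not st:
                continue  # drop tracks whose points were not found
            t.append(p)
            if len(t) > traj_len:
                # Use the sequence of (dx, dy) displacements as the descriptor.
                disp = np.diff(np.array(t[-traj_len - 1:]), axis=0).ravel()
                trajectories.append(disp)
                t = t[-1:]  # restart the track from its last point
            new_tracks.append(t)
        tracks = new_tracks
        prev_gray = gray
    cap.release()
    return np.array(trajectories)

def encode_bow(desc_per_video, n_words=100):
    """Quantize trajectory descriptors into a normalized bag-of-words
    histogram per video using a k-means codebook."""
    all_desc = np.vstack([d for d in desc_per_video if len(d)])
    codebook = KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(all_desc)
    histograms = []
    for d in desc_per_video:
        h = np.zeros(n_words)
        if len(d):
            words, counts = np.unique(codebook.predict(d), return_counts=True)
            h[words] = counts
            h /= h.sum()
        histograms.append(h)
    return np.array(histograms), codebook
```

The resulting histograms can then be fed to any standard classifier (for example sklearn.svm.LinearSVC) with per-video action labels, mirroring the bag-of-words classification step the abstract describes; the choice of classifier here is an assumption of the sketch.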
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2016.07.031