Loading…

Visual-GPS: Ego-Downward and Ambient Video Based Person Location Association

In a crowded and cluttered environment, identifying a particular person is a challenging problem. Current identification approaches are not able to handle the dynamic environment. In this paper, we tackle the problem of identifying and tracking a person of interest in the crowded environment using e...

Full description

Saved in:
Bibliographic Details
Main Authors: Yang, Liang, Jiang, Hao, Huo, Zhouyuan, Xiao, Jizhong
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In a crowded and cluttered environment, identifying a particular person is a challenging problem. Current identification approaches are not able to handle the dynamic environment. In this paper, we tackle the problem of identifying and tracking a person of interest in the crowded environment using egocentric and third person view videos. We propose a novel method (Visual-GPS) to identify, track, and localize the person, who is capturing the egocentric video, using joint analysis of imagery from both videos. The output of our method is the bounding box of the target person detected in each frame of the third person view and the 3D metric trajectory. At glance, the views of the two cameras are quite different. This paper illustrates an insight into how they are correlated. Our proposed method uses several difference clues. In addition to using RGB images, we take advantage of both the body motion and action features to correlate the two views. We can track and localize the person by finding the most "correlated" individual in the third view. Furthermore, the target person's 3D trajectory is recovered based on the mapping of the 2d-3D body joints. Our experiment confirms the effectiveness of ETVIT network and shows 18.32 % improvement in detection accuracy against the baseline methods.
ISSN:2160-7516
DOI:10.1109/CVPRW.2019.00050