Gaze Tracking in 3D Space with a Convolution Neural Network "See What I See"
Main Authors:
Format: Conference Proceeding
Language: English
Summary: This paper presents an integrated architecture for estimating gaze vectors under unrestricted head motion. Previous approaches focused on estimating gaze toward a small planar screen and therefore required calibration prior to use. Using a Kinect device, we develop a method that relies on depth sensing to obtain robust and accurate head-pose tracking, and we obtain the eye-in-head gaze direction by training a Neural Network (NN) model on visual data from eye images. Our model is a Convolution Neural Network (CNN) with five layers: two convolution-pooling pairs followed by a fully connected output layer. The filters are learned from random patches of the images in an unsupervised way by k-means clustering. The learned filters feed the convolution layers, each of which is followed by a pooling layer that reduces the resolution of the feature map and the sensitivity of the output to shifts and distortions. Finally, the fully connected layer serves as a classifier, with its weights obtained through a feed-forward process. We reconstruct the gaze vectors from a set of head and eye pose orientations. The results of this approach suggest a gaze estimation error of 5 degrees; the model is more accurate than a simple NN and an adaptive linear regression (ALR) approach.
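The abstract describes the pipeline concretely enough to sketch: filters learned by k-means on random eye-image patches, two convolution-pooling pairs, and a fully connected output. Below is a minimal, hypothetical Python sketch of that pipeline under stated assumptions; the patch size, filter counts, input resolution, and the `learn_filters`/`GazeCNN` names are illustrative, not the authors' published configuration.

```python
# Hypothetical sketch of the five-layer pipeline described in the abstract:
# k-means-learned filters feeding two convolution-pooling pairs and a
# fully connected output layer. All sizes are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

PATCH = 5        # assumed filter size
N_FILTERS = 16   # assumed number of first-layer filters

def learn_filters(images: np.ndarray, k: int = N_FILTERS, patch: int = PATCH) -> torch.Tensor:
    """Learn convolution filters from random image patches via k-means.

    `images` is (N, H, W) grayscale eye crops in [0, 1]; each cluster
    centroid becomes one (1, patch, patch) filter.
    """
    rng = np.random.default_rng(0)
    n, h, w = images.shape
    patches = []
    for _ in range(10_000):                    # sample random patches
        i = rng.integers(n)
        y = rng.integers(h - patch)
        x = rng.integers(w - patch)
        p = images[i, y:y + patch, x:x + patch].ravel()
        p = (p - p.mean()) / (p.std() + 1e-8)  # per-patch normalization
        patches.append(p)
    km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(np.stack(patches))
    return torch.tensor(km.cluster_centers_, dtype=torch.float32).reshape(k, 1, patch, patch)

class GazeCNN(nn.Module):
    """Two convolution-pooling pairs followed by a fully connected output."""
    def __init__(self, filters: torch.Tensor, n_outputs: int = 2):
        super().__init__()
        k = filters.shape[0]
        self.conv1 = nn.Conv2d(1, k, PATCH)
        self.conv1.weight.data.copy_(filters)  # k-means filters, not backprop-learned
        self.pool1 = nn.MaxPool2d(2)           # pooling reduces resolution / shift sensitivity
        self.conv2 = nn.Conv2d(k, 2 * k, PATCH)
        self.pool2 = nn.MaxPool2d(2)
        # Two outputs for eye-in-head yaw/pitch (an assumption; the paper
        # describes the fully connected stage as a classifier).
        self.fc = nn.LazyLinear(n_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        return self.fc(x.flatten(1))

# Usage on dummy data: 36x60 eye crops (a common gaze-dataset resolution).
eyes = np.random.rand(50, 36, 60).astype(np.float32)
model = GazeCNN(learn_filters(eyes))
pred = model(torch.from_numpy(eyes[:4]).unsqueeze(1))  # -> (4, 2) gaze angles
```

Per the abstract, the final 3-D gaze vector would then be reconstructed by composing this eye-in-head direction with the Kinect-tracked head pose, e.g. rotating it into the world frame with the head-pose rotation matrix (g_world = R_head · g_eye).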
ISSN: 2332-5615
DOI: 10.1109/AIPR.2017.8457962