Transformer-Based Self-Supervised Monocular Depth and Visual Odometry
Published in: IEEE Sensors Journal, 2023-01, Vol. 23 (2), pp. 1436-1446
Main Authors:
Format: Article
Language: English
Summary: Self-supervised monocular depth and visual odometry (VO) are often cast as coupled tasks: accurate depth contributes to precise pose estimation and vice versa. Existing architectures typically stack convolution layers and long short-term memory (LSTM) units to capture long-range dependencies. However, the intrinsic locality of these operations prevents the model from achieving the expected performance gain. In this article, we propose a Transformer-based architecture, named Transformer-based self-supervised monocular depth and VO (TSSM-VO), to tackle these problems. It comprises two main components: 1) a depth generator that leverages the powerful capability of multihead self-attention (MHSA) to model long-range spatial dependencies and 2) a pose estimator built upon a Transformer to learn long-range temporal correlations of image sequences. Moreover, a new data augmentation loss based on structural similarity (SSIM) is introduced to further constrain the structural similarity between the augmented depth and the augmented predicted depth. Rigorous ablation studies and exhaustive performance comparisons on the KITTI and Make3D datasets demonstrate the superiority of TSSM-VO over other self-supervised methods. We expect that TSSM-VO will enhance the ability of intelligent agents to understand their surrounding environments.
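The SSIM-based data augmentation loss described above penalizes structural dissimilarity between an augmented depth map and the corresponding augmented predicted depth. The paper's exact formulation is not given in this record; the following is a minimal NumPy sketch of such a loss, assuming a global SSIM with the standard stabilizing constants (c1 = 0.01^2, c2 = 0.03^2) and the common convention loss = 1 - SSIM. The function names `ssim` and `ssim_depth_loss` are illustrative, not from the paper.

```python
import numpy as np

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global structural similarity between two same-sized maps in [0, 1]."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def ssim_depth_loss(depth_aug, depth_pred_aug):
    """Loss penalizing structural dissimilarity between the augmented depth
    and the augmented predicted depth (loss = 1 - SSIM)."""
    return 1.0 - ssim(depth_aug, depth_pred_aug)

# Identical maps yield zero loss; corrupting one map raises the loss.
rng = np.random.default_rng(0)
depth = rng.random((64, 64))
print(ssim_depth_loss(depth, depth.copy()))  # ~0.0
```

In practice SSIM is usually computed over local windows and averaged; the global version here keeps the sketch short while preserving the loss's behavior (zero for identical maps, growing with structural distortion).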
ISSN: 1530-437X; 1558-1748
DOI: 10.1109/JSEN.2022.3227017