6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning

This paper presents 6D vision transformer (6D-ViT), a transformer-based instance representation learning network suitable for highly accurate category-level object pose estimation based on RGB-D images. Specifically, a novel two-stream encoder-decoder framework is dedicated to exploring complex and...

Bibliographic Details
Published in:	IEEE transactions on image processing 2022, Vol.31, p.1-1
Main Authors:	Zou, Lu; Huang, Zhangjin; Gu, Naijie; Wang, Guoping
Format:	Article
Language:	English
Subjects:	3D object detection; 6D object pose estimation; Coders; Color imagery; Datasets; Encoders-Decoders; Image reconstruction; Multilayer perceptrons; Point cloud compression; Pose estimation; Representation learning; Shape; Solid modeling; Transformers; vision transformer