A Novel SO(3) Rotational Equivariant Masked Autoencoder for 3D Mesh Object Analysis
Published in: | IEEE Transactions on Circuits and Systems for Video Technology 2024-09, p.1-1 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Equivariant networks have recently made significant strides in computer vision tasks related to robotic grasping, molecule generation, and 6D pose tracking. In this paper, we explore 3D mesh object analysis based on an equivariant masked autoencoder to reduce the model dependence on large datasets and predict the pose transformation. We employ 3D reconstruction tasks under rotation and masking operations, such as segmentation tasks after rotation, as pretraining to enhance downstream task performance. To mitigate the computational complexity of the algorithm, we first utilize multiple non-overlapping 3D mesh patches with a fixed face size. We then design a rotation-equivariant self-attention mechanism to obtain advanced features. To improve the throughput of the encoder, we design a sparse token merging strategy. Our method achieves comparable performance on equivariant analysis tasks of mesh objects, such as 3D mesh pose transformation estimation, object classification and part segmentation on the ShapeNetCore16, Manifold40, COSEG-aliens, COSEG-vases and Human Body datasets. In the object classification task, we achieve superior performance even when only 10% of the original sample is used. We perform extensive ablation experiments to demonstrate the efficacy of critical design choices in our approach. |
ISSN: | 1051-8215, 1558-2205 |
DOI: | 10.1109/TCSVT.2024.3465041 |