Lightweight LiDAR-Camera Alignment With Homogeneous Local-Global Aware Representation
Published in: IEEE Transactions on Intelligent Transportation Systems, 2024-11, Vol. 25 (11), pp. 15922-15933
Main Authors:
Format: Article
Language: English
Summary: In this paper, a novel LiDAR-Camera Alignment (LCA) method using a homogeneous local-global spatial aware representation is proposed. Compared with state-of-the-art methods (e.g., LCCNet), our proposition offers two main advantages. First, a homogeneous multi-modality representation learned with a uniform CNN model is applied along the iterative prediction stages, instead of the heterogeneous counterparts extracted by separate modality-wise CNN models within each stage. In this way, the model size can be significantly decreased (e.g., 12.39M (ours) vs. 333.75M (LCCNet)). Meanwhile, within our proposition the interaction between LiDAR and camera data is built during feature learning to better exploit the descriptive clues, which has not been well addressed by existing approaches. Second, we propose to equip the learned LCA representation with local-global spatial awareness by encoding the CNN's local convolutional features in the Transformer's non-local self-attention manner. Accordingly, local fine details and global spatial context can be jointly captured by the encoded local features and jointly used for LCA. In contrast, existing methods generally reveal the global spatial property by simply concatenating the local features. Additionally, at the initial LCA stage, LiDAR is roughly aligned with the camera by our pre-alignment method, according to the point distribution characteristics of its 2D projection under the initial extrinsic parameters. Although its structure is simple, it essentially alleviates LCA's difficulty for the subsequent stages. To better optimize LCA, a novel loss function that builds the correlation between the translation and rotation loss items is also proposed. Experiments on KITTI data verify the superiority of our proposition in both effectiveness and efficiency. The source code will be released at https://github.com/Zaf233/Light-weight-LCA upon acceptance.
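To make the local-global aware encoding described in the summary more concrete, the sketch below flattens CNN feature-map locations into tokens and applies Transformer self-attention over them, so each local feature also carries global spatial context. This is a minimal PyTorch-style sketch under assumed shapes, class names, and hyper-parameters (e.g., `LocalGlobalEncoder`), not the authors' released implementation.

```python
import torch
import torch.nn as nn


class LocalGlobalEncoder(nn.Module):
    """Hypothetical sketch: encode local CNN features with non-local self-attention."""

    def __init__(self, channels=256, num_heads=8, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feat):
        # feat: (B, C, H, W) local convolutional features
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)   # (B, H*W, C): one token per location
        tokens = self.attn(tokens)                 # self-attention over all locations
        return tokens.transpose(1, 2).reshape(b, c, h, w)


# Usage example with an assumed feature-map size:
enc = LocalGlobalEncoder(channels=256)
out = enc(torch.randn(2, 256, 16, 32))             # same shape, now globally context-aware
```

The design choice illustrated here is that the Transformer operates on the CNN's local features themselves rather than replacing them, so fine local detail and global spatial context end up in the same representation.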
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2024.3409397