[Formula Omitted]: Simple Joint 3-D Detection and Tracking With Three-Step Offset Learning

Bibliographic Details
Published in: IEEE Transactions on Industrial Informatics, 2024-01, Vol. 20 (2), p. 2284
Main Authors: Sun, Jing; Ji, Yi-Mu; He, Jing; Wu, Fei; Sun, Yanfei
Format: Article
Language:English
Description
Summary: Light-detection-and-ranging-based multiobject detection and tracking play fundamental roles in autonomous driving systems. Most existing detection and tracking methods inevitably require complex pairing permutations for object association across frames, which slows the framework. Moreover, occlusion and viewpoint changes lead to missed and false detections. To address these issues, this article proposes a simple joint 3-D detection and tracking approach with three-step offset learning ([Formula Omitted]). Specifically, [Formula Omitted] incorporates three task-specific output subnetworks that learn three offsets: 1) a center offset, 2) a motion offset, and 3) an association offset. Learning these offsets eliminates the need for complex bipartite matching. The center offset guides the model to generate precise detections, the motion offset transforms each track from the previous frame to the current frame, and the association offset minimizes the distance between a detection and the motion-updated track of the same object. A simple read-off operation then performs data association on a hybrid-time centerness map, which represents both the detections and the offset-updated tracks. In addition, we design a detection-feature-enhanced module that captures the temporal coherence of object motion and appearance information, avoiding missed and false detections. Experiments on nuScenes demonstrate the effectiveness of our [Formula Omitted] in terms of accuracy and speed compared with most 3-D detection and tracking methods.
ISSN: 1551-3203; 1941-0050
DOI: 10.1109/TII.2023.3290184
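
The following is a minimal, hypothetical sketch (in PyTorch) of the three-offset idea the summary describes: three small heads predict center, motion, and association offsets over a shared bird's-eye-view feature map, and association is a simple nearest-center "read-off" after motion-updating the tracks, rather than bipartite matching. The class and function names (ThreeOffsetHeads, read_off_association), layer widths, grid-unit distances, and the matching radius are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the three-step offset learning described in the summary.
# All names, shapes, and thresholds are assumptions, not the paper's code.
import torch
import torch.nn as nn


def _head(in_ch: int, out_ch: int) -> nn.Sequential:
    """A lightweight two-layer conv head, one per predicted quantity."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(64, out_ch, kernel_size=1),
    )


class ThreeOffsetHeads(nn.Module):
    """Predicts a centerness heatmap plus center/motion/association offsets."""

    def __init__(self, in_ch: int = 256, num_classes: int = 10):
        super().__init__()
        self.heatmap = _head(in_ch, num_classes)   # per-class centerness map
        self.center_offset = _head(in_ch, 2)       # sub-pixel center refinement
        self.motion_offset = _head(in_ch, 2)       # previous-frame -> current-frame shift
        self.assoc_offset = _head(in_ch, 2)        # pulls a detection toward its updated track

    def forward(self, bev_feat: torch.Tensor) -> dict:
        return {
            "heatmap": self.heatmap(bev_feat).sigmoid(),
            "center_offset": self.center_offset(bev_feat),
            "motion_offset": self.motion_offset(bev_feat),
            "assoc_offset": self.assoc_offset(bev_feat),
        }


def read_off_association(det_centers: torch.Tensor,
                         track_centers: torch.Tensor,
                         motion: torch.Tensor,
                         radius: float = 2.0) -> list:
    """Greedy nearest-center association after motion-updating the tracks.

    det_centers:   (N, 2) detection centers in the current frame (BEV grid units)
    track_centers: (M, 2) track centers from the previous frame
    motion:        (M, 2) predicted motion offsets for each track
    Returns (det_idx, track_idx) pairs whose distance is below `radius`.
    """
    if len(det_centers) == 0 or len(track_centers) == 0:
        return []
    updated = track_centers + motion              # warp tracks into the current frame
    dist = torch.cdist(det_centers, updated)      # (N, M) pairwise distances
    matches, used = [], set()
    for d in range(dist.shape[0]):
        t = int(dist[d].argmin())
        if dist[d, t] <= radius and t not in used:  # one-to-one read-off, no Hungarian step
            matches.append((d, t))
            used.add(t)
    return matches
```

The greedy read-off above stands in for the paper's read-off on the hybrid-time centerness map: because the learned offsets are supposed to pull a detection and its motion-updated track close together, a nearest-center lookup within a small radius can replace an explicit bipartite (e.g., Hungarian) matching stage, which is where the claimed speed advantage comes from.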