3D Multi-object Tracking Based on Joint Optimization of Object Detection and Scene Flow Estimation
Abstract: Most 3D multi-object tracking methods optimize object detection and inter-frame data association independently, without considering the coupling between single-frame feature learning and inter-frame association learning. To learn single-frame detection and inter-frame association jointly, a 3D multi-object tracking framework named FlowDet-Track is proposed, based on the joint optimization of object detection and scene flow estimation. Within this framework, a detection-guided scene flow estimation module is introduced to reduce incorrect inter-frame associations. To obtain more accurate scene flow labels, especially under rotational motion, a box-transformation-based method for computing the scene flow ground truth is proposed. Experimental results on the KITTI MOT dataset show that, for the vehicle category, the proposed method improves HOTA and DetA by 25.03% and 30.8%, respectively, over the PointTrackNet algorithm, indicating high position tracking accuracy. In addition, comparative experiments under extreme rotational motion further demonstrate the robustness of the algorithm.
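The idea behind a box-transformation-based scene flow label can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions not stated in the abstract: each box is described by a center and a yaw angle about a vertical z axis, the object moves rigidly between frames, and the function name `box_transform_flow` is hypothetical. Each point inside the box at frame t is mapped through the rigid transform that carries the box at frame t onto the box at frame t+1, and the flow is the resulting displacement.

```python
import math


def box_transform_flow(points, box_t, box_t1):
    """Per-point scene flow for points on an object that moves rigidly.

    points : list of (x, y, z) coordinates in frame t
    box_t  : (cx, cy, cz, yaw) of the object's box in frame t
    box_t1 : (cx, cy, cz, yaw) of the same object's box in frame t+1

    Assumption (illustrative only): yaw is a rotation about the z axis.
    """
    cx0, cy0, cz0, yaw0 = box_t
    cx1, cy1, cz1, yaw1 = box_t1
    dyaw = yaw1 - yaw0
    c, s = math.cos(dyaw), math.sin(dyaw)

    flows = []
    for x, y, z in points:
        # Offset of the point from the box center in frame t.
        dx, dy, dz = x - cx0, y - cy0, z - cz0
        # Rotate the offset by the yaw change, then attach it to the new center.
        nx = cx1 + c * dx - s * dy
        ny = cy1 + s * dx + c * dy
        nz = cz1 + dz
        # Flow is the displacement from the old to the new position.
        flows.append((nx - x, ny - y, nz - z))
    return flows
```

For a box that only rotates in place, a label derived from the center displacement alone would assign zero flow to every point, whereas the transform above gives each point a displacement that depends on its offset from the center. This is one plausible reading of why a box-based label helps under rotational motion.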