Abstract:
To enhance the robustness of multi-camera multi-target tracking in low-light environments, a multi-modal tracking algorithm that fuses infrared and visible-light video, named FMMT (Fusion-based Multi-camera Multi-target Tracking), is proposed. The algorithm employs a deep neural network to adaptively fuse multi-modal features from visible-light and infrared cameras, and uses a global association Transformer for cross-camera target association. For validation, the first multi-modal multi-camera multi-target tracking dataset, named M3 Track, is constructed; it contains 20 scenes, 100k image pairs, and 1.129 million targets. Experimental results show that the proposed algorithm achieves 61.7 CVMA (cross-view matching accuracy) and 70.3 CVIDF1 (cross-view IDF1) on the M3 Track dataset, significantly outperforming competing methods, especially in nighttime scenarios. This work provides an effective solution for multi-camera multi-target tracking under complex lighting conditions.
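The abstract does not specify how the adaptive multi-modal fusion is realized; below is a minimal PyTorch sketch of one plausible design, a learned per-pixel gate over visible-light and infrared backbone features. The module name, layer sizes, and gating scheme are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class AdaptiveFusion(nn.Module):
    """Illustrative gated fusion of visible-light and infrared feature maps.

    A small convolutional gate predicts a per-pixel weight from the
    concatenated modalities, so the network can lean on infrared features
    where the visible-light signal is weak (e.g., at night). All layer
    sizes here are hypothetical.
    """

    def __init__(self, channels: int = 256):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),  # per-location fusion weight in [0, 1]
        )

    def forward(self, feat_rgb: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
        # Convex combination of the two modalities, weighted per pixel.
        w = self.gate(torch.cat([feat_rgb, feat_ir], dim=1))
        return w * feat_rgb + (1 - w) * feat_ir


if __name__ == "__main__":
    fusion = AdaptiveFusion(channels=256)
    rgb = torch.randn(1, 256, 32, 32)  # visible-light backbone features
    ir = torch.randn(1, 256, 32, 32)   # infrared backbone features
    fused = fusion(rgb, ir)
    print(fused.shape)  # torch.Size([1, 256, 32, 32])
```

A gate of this kind degrades gracefully: when the visible channel is uninformative, the learned weight can approach zero and the fused feature reduces to the infrared branch, which is consistent with the nighttime gains the abstract reports.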