Multi-camera and Multi-target Tracking Based on Infrared and Visible Light



    Abstract: To enhance the robustness of multi-camera multi-target tracking in low-light environments, a multi-modal tracking algorithm that fuses infrared and visible-light videos, named FMMT (fusion-based multi-camera multi-target tracking), is proposed. The algorithm employs a deep neural network to adaptively fuse multi-modal features from visible-light and infrared cameras, and uses a global association Transformer for cross-camera target association. For validation, the first multi-modal multi-camera multi-target tracking dataset, M3Track, is constructed; it contains 20 scenes, 100k image pairs, and 1.129 million targets. Experimental results show that the proposed algorithm achieves 61.7 CVMA (cross-view matching accuracy) and 70.3 CVIDF1 (cross-view IDF1) on the M3Track dataset, significantly outperforming comparison methods, especially in nighttime scenarios. This work provides an effective solution for multi-camera multi-target tracking under complex lighting conditions.
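The adaptive fusion idea described above (leaning on infrared features when visible-light features are unreliable, e.g., at night) can be sketched with a simple learned gating mechanism. This is an illustrative NumPy sketch of channel-wise gated fusion, not the paper's actual FMMT network; the gate weights `w_gate`/`b_gate` stand in for learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_fuse(feat_vis, feat_ir, w_gate, b_gate):
    """Fuse visible-light and infrared feature vectors with a learned gate.

    A per-channel weight in (0, 1) is predicted from the concatenated
    features, so the fused feature can favor infrared channels when the
    visible-light signal is weak. Hypothetical gating scheme for
    illustration only.
    """
    concat = np.concatenate([feat_vis, feat_ir])      # shape (2d,)
    gate = sigmoid(w_gate @ concat + b_gate)          # shape (d,), values in (0, 1)
    return gate * feat_vis + (1.0 - gate) * feat_ir   # per-channel convex combination

# Toy example with d = 4 feature channels.
rng = np.random.default_rng(0)
d = 4
feat_vis = rng.normal(size=d)
feat_ir = rng.normal(size=d)
w_gate = rng.normal(size=(d, 2 * d)) * 0.1
b_gate = np.zeros(d)

fused = adaptive_fuse(feat_vis, feat_ir, w_gate, b_gate)
print(fused.shape)
```

Because the gate output is a per-channel convex combination, each fused channel always lies between the corresponding visible and infrared feature values; in a trained network the gate would learn to shift toward infrared in low-light inputs.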
