ZHANG Jian, CHEN Yeheng, ZHU Shiqiang, LI Yuehua. An RGB-D Fusion Based Semantic Segmentation Algorithm Based on Neighborhood Metric Relations[J]. ROBOT, 2023, 45(2): 156-165. DOI: 10.13973/j.cnki.robot.210550

An RGB-D Fusion Based Semantic Segmentation Algorithm Based on Neighborhood Metric Relations

Abstract: Complex extraterrestrial environments and limited computing resources lead to low semantic segmentation accuracy in deep space exploration. To address this problem, an RGB-D fusion semantic segmentation algorithm based on neighborhood metric relations is proposed. The algorithm replaces traditional monocular camera data with multi-modal RGB-D information, builds its base network on a mid-level fusion framework, and adds a neighborhood metric relation module to improve performance. Specifically, the mid-level fusion network applies refinement, fusion, and skip connections to the original features at different scales, so that cross-modal data and cross-level features complement each other effectively. Furthermore, neighborhood metric relations are constructed from the semantic feature maps and semantic labels in a way that adds no inference overhead, and the correlations between sample categories are mined from global and local features to improve the segmentation network. Experiments on the indoor dataset NYUDv2 and the Mars simulation site dataset MARSv1 show that both the multi-modal RGB-D information and the neighborhood metric relations significantly improve semantic segmentation accuracy.

     
