Abstract:
To address the low semantic segmentation accuracy caused by complex extraterrestrial environments and limited onboard computing resources in deep space exploration, an RGB-D fusion semantic segmentation algorithm based on neighborhood metric relations is proposed. The algorithm replaces traditional monocular camera data with multi-modal RGB-D information, builds the base network on a mid-stage fusion framework, and additionally designs a neighborhood-metric-relations module to improve performance. Specifically, the mid-stage fusion network refines, fuses, and patches the original features at different scales to achieve effective complementarity between cross-modal data and cross-level features. Furthermore, the neighborhood metric relation is constructed by combining semantic feature maps with semantic labels, without increasing the inference cost, and correlation information between sample categories is mined from global and local features to improve the performance of the segmentation network. Experiments are carried out on the indoor dataset NYUDv2 and the Mars simulation site dataset MARSv1, and the results show that multi-modal RGB-D information and neighborhood metric relations significantly improve semantic segmentation accuracy.
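The abstract does not give implementation details, but the claim that the neighborhood metric relation adds no inference cost suggests a training-time auxiliary loss computed from the semantic feature map and the label map. The following is a minimal sketch of one such formulation: per-class feature centroids supply the global relation, per-pixel distances to the own-class centroid supply the local relation, and a margin pushes different-class centroids apart. The function name `neighborhood_metric_loss`, the `margin` parameter, and the centroid-based construction are illustrative assumptions, not the authors' exact module.

```python
import torch
import torch.nn.functional as F


def neighborhood_metric_loss(features: torch.Tensor,
                             labels: torch.Tensor,
                             margin: float = 1.0,
                             ignore_index: int = 255) -> torch.Tensor:
    """Hypothetical auxiliary loss from semantic features and labels.

    features: (B, C, H, W) semantic feature map from the segmentation head.
    labels:   (B, h, w) integer class map, resized to the feature resolution.
    """
    B, C, H, W = features.shape
    # Nearest-neighbour downsampling so each feature vector gets one class tag.
    labels = F.interpolate(labels[:, None].float(), size=(H, W),
                           mode="nearest").long().view(-1)
    feats = features.permute(0, 2, 3, 1).reshape(-1, C)      # (B*H*W, C)

    valid = labels != ignore_index
    feats, labels = feats[valid], labels[valid]

    # Global relation: per-class centroids over the batch.
    classes = labels.unique()
    centroids = torch.stack([feats[labels == c].mean(dim=0) for c in classes])

    # Local relation: pull each feature toward its own class centroid.
    intra = feats.new_zeros(())
    for i, c in enumerate(classes):
        intra = intra + (feats[labels == c] - centroids[i]).pow(2).sum(1).mean()
    intra = intra / len(classes)

    # Inter-class relation: push distinct centroids at least `margin` apart.
    if len(classes) > 1:
        dists = torch.cdist(centroids, centroids)            # (K, K) pairwise
        off_diag = ~torch.eye(len(classes), dtype=torch.bool,
                              device=dists.device)
        inter = F.relu(margin - dists[off_diag]).pow(2).mean()
    else:
        inter = feats.new_zeros(())
    return intra + inter
```

Used this way, the term is simply added to the cross-entropy objective during training (e.g. `loss = ce_loss + 0.1 * neighborhood_metric_loss(feat_map, gt_labels)`, with a hypothetical weight of 0.1), so the inference path and its cost are unchanged, consistent with the abstract's claim.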