Semantic Map Construction Based on Deep Convolutional Neural Network

Hu Meiyu; Zhang Yunzhou; Qin Cao; Liu Tongbo

doi:10.13973/j.cnki.robot.180406

Abstract

This paper combines image semantic segmentation with simultaneous localization and mapping (SLAM) to build a 3D semantic map of the environment. Key frames are first selected from the input image sequence by ORB-SLAM. An improved semantic segmentation method based on the DeepLab algorithm is then proposed. An upsampling convolutional layer is added after the last layer of the convolutional network to mitigate the coarseness of bilinear interpolation. The depth map of each key frame serves as a gating signal that controls the choice among different convolution operations, so that fine details are preserved for distant objects while a larger receptive field is maintained for nearby objects. The segmented image is then aligned with the depth map, and a dense 3D semantic map is built using the spatial correspondence between adjacent key frames. Experimental results show that, for both indoor and outdoor scenes, the proposed algorithm achieves accurate semantic segmentation, and that back-projecting the segmentation results into 3D space yields a good-quality semantic map. Compared with most current methods based on DeepLab and deconvolution, the proposed algorithm produces a better semantic map.
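The back-projection step mentioned in the abstract, in which a segmented key frame aligned with its depth map is lifted into 3D space, can be illustrated with a minimal sketch. It assumes a standard pinhole camera model and is not the authors' implementation: the function name `back_project`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the optional camera-to-world `pose` matrix are all hypothetical names introduced here for illustration.

```python
import numpy as np

def back_project(depth, labels, fx, fy, cx, cy, pose=None):
    """Lift labeled pixels into 3D points (illustrative sketch only).

    depth  : (H, W) array of depths; 0 marks an invalid pixel (assumption)
    labels : (H, W) integer class ids, pixel-aligned with `depth`
    fx, fy, cx, cy : assumed pinhole intrinsics
    pose   : optional 4x4 camera-to-world transform for the key frame
    Returns an (N, 3) array of points and an (N,) array of class ids.
    """
    h, w = depth.shape
    # u is the column index, v is the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    # Invert the pinhole projection u = fx * x / z + cx, v = fy * y / z + cy.
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    if pose is not None:
        # Move the points from the camera frame into the world frame.
        points = points @ pose[:3, :3].T + pose[:3, 3]
    return points, labels[valid]
```

Calling this once per key frame with that frame's estimated pose and concatenating the returned points and labels would give a labeled point cloud in a common frame. How the paper fuses overlapping observations from adjacent key frames is not stated in the abstract, so that step is left out of the sketch.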
