RGB-D SLAM Method Based on Enhanced Segmentation in Dynamic Environment
WANG Hao1,2, LU Dejiu1,2, FANG Baofu1,2
1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China; 2. Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230009, China
Abstract: Current visual SLAM (simultaneous localization and mapping) methods are prone to missing dynamic objects during removal in dynamic environments, which degrades the accuracy of camera pose estimation and the usability of the map. Therefore, an RGB-D SLAM method based on enhanced segmentation is proposed. Firstly, the results of an instance segmentation network and of depth-image clustering are combined to check whether segmentation is missed in the current frame; if the segmentation is incomplete, it is repaired using multi-frame information. At the same time, the Shi-Tomasi corners extracted from the current frame are screened by the symmetric transfer error to select the set of dynamic corners. Then the motion state of each instance object in the scene is determined from the repaired instance segmentation results. Finally, static feature points are used to track the camera pose and to construct an instance-level semantic octree map. The proposed method is evaluated on the TUM public dataset and in real scenes. Compared with state-of-the-art visual SLAM methods, it achieves higher camera pose estimation accuracy and better map construction, showing stronger robustness.
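The corner-screening step above relies on the symmetric transfer error: for a pair of matched corners, the distance of each point to the epipolar line induced by the other is summed, and matches whose error exceeds a threshold are treated as dynamic. The following minimal sketch (in Python with NumPy; function names, the example fundamental matrix, and the threshold value are illustrative assumptions, not the paper's implementation) shows the idea:

```python
import numpy as np

def symmetric_transfer_error(F, x1, x2):
    """Symmetric epipolar distance of a corner match (x1 in frame 1,
    x2 in frame 2) with respect to the fundamental matrix F."""
    x1h = np.append(np.asarray(x1, float), 1.0)  # homogeneous coords
    x2h = np.append(np.asarray(x2, float), 1.0)
    l2 = F @ x1h    # epipolar line of x1 in frame 2
    l1 = F.T @ x2h  # epipolar line of x2 in frame 1
    d2 = abs(l2 @ x2h) / np.hypot(l2[0], l2[1])  # point-to-line distances
    d1 = abs(l1 @ x1h) / np.hypot(l1[0], l1[1])
    return d1 + d2

def split_corners(F, matches, thresh=1.0):
    """Partition matched corners into static/dynamic sets by thresholding
    the symmetric transfer error (threshold chosen for illustration)."""
    static, dynamic = [], []
    for x1, x2 in matches:
        if symmetric_transfer_error(F, x1, x2) < thresh:
            static.append((x1, x2))
        else:
            dynamic.append((x1, x2))
    return static, dynamic

# Toy example: F for a pure horizontal camera translation (identity
# intrinsics), so static points keep the same image row between frames.
F = np.array([[0., 0., 0.],
              [0., 0., -1.],
              [0., 1., 0.]])
matches = [((10., 5.), (12., 5.)),   # moves along the epipolar line: static
           ((10., 5.), (12., 9.))]   # leaves the epipolar line: dynamic
static, dynamic = split_corners(F, matches)
```

In practice the fundamental matrix would be estimated from the matched corners themselves (e.g. with RANSAC), so that the dominant static background defines the epipolar geometry and independently moving points stand out as outliers.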