RGB-D Visual Odometry Based on Features of Planes and Line Segments in Indoor Environments
DONG Xingliang1, YUAN Jing1, HUANG Shuzi1, YANG Shaokun1, ZHANG Xuebo1, SUN Fengchi2, HUANG Yalou2
1. College of Computer and Control Engineering, Nankai University, Tianjin 300071, China;
2. College of Software, Nankai University, Tianjin 300071, China
Abstract: Considering the structural characteristics of indoor environments, an RGB-D visual odometry algorithm based on plane and line segment features is proposed. Normal vectors of the RGB-D scanning points are used to cluster the 3D point cloud, and the RANSAC (random sample consensus) algorithm is then applied to fit a plane to each cluster of the 3D point set, thereby extracting the plane features of the environment. After that, an edge point detection algorithm is used to segment the edge point set and extract the line segment features of the environment. Then, a feature matching algorithm based on the geometric constraints of planes and line segments is proposed to match the features between frames. If the matched plane and line segment features provide sufficient pose constraints, the RGB-D camera pose is computed directly from the feature correspondences; otherwise, the pose is estimated using the endpoints and the point sets of the matched line segments. Experimental results on the TUM (Technical University of Munich) datasets demonstrate that using planes and line segments as environmental features improves the accuracy of visual odometry estimation and environmental mapping. On the fr3/cabinet dataset in particular, the root-mean-square errors of rotation and translation of the proposed algorithm are 2.046°/s and 0.034 m/s, respectively, which are significantly better than those of other classical visual odometry algorithms. Finally, the system is deployed on a mobile robot to build maps of real indoor environments. The system builds accurate environmental maps and runs at 3 frames/s, which meets the requirement of real-time processing.
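To make the plane-extraction step concrete, the following is a minimal sketch (not the authors' implementation) of RANSAC plane fitting applied to a single normal-based cluster of the 3D point set; all function and variable names, thresholds, and iteration counts are illustrative assumptions.

```python
import numpy as np

def ransac_plane_fit(points, iterations=200, inlier_thresh=0.01):
    """Fit a plane n·x + d = 0 to a 3D point cluster with RANSAC.

    points: (N, 3) array of 3D points from one normal-based cluster.
    Returns (normal, d, inlier_mask) for the best plane hypothesis.
    """
    rng = np.random.default_rng(0)
    n_points = points.shape[0]
    best_inliers, best_plane = None, None
    for _ in range(iterations):
        # Sample 3 distinct points and form a plane hypothesis.
        idx = rng.choice(n_points, size=3, replace=False)
        p0, p1, p2 = points[idx]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-8:  # degenerate (nearly collinear) sample
            continue
        normal /= norm
        d = -np.dot(normal, p0)
        # Point-to-plane distances decide the inlier set.
        dist = np.abs(points @ normal + d)
        inliers = dist < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    # Refine the winning hypothesis by least squares over its inliers.
    inlier_pts = points[best_inliers]
    centroid = inlier_pts.mean(axis=0)
    _, _, vt = np.linalg.svd(inlier_pts - centroid)
    normal = vt[-1]
    d = -np.dot(normal, centroid)
    return normal, d, best_inliers
```

In the pipeline described above, such a fit would presumably be run once per cluster produced by the normal-vector clustering, with clusters whose inlier ratio is too low discarded as non-planar.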