A Large Viewing Angle 3-Dimensional V-SLAM Algorithm with a Kinect-based Mobile Robot System
XIN Jing1, GOU Jiaolong1, MA Xiaomin1, HUANG Kai1, LIU Ding1, ZHANG Youmin2
1. School of Automation & Information Engineering, Xi'an University of Technology, Xi'an 710048, China;
2. Department of Mechanical and Industrial Engineering, Concordia University, Montreal H3G 1M8, Canada
To address the performance degradation of mobile robot 3D V-SLAM (visual simultaneous localization and mapping) under large viewing angle changes, an affine-invariant feature matching algorithm, AORB (affine oriented FAST and rotated BRIEF), is proposed, and a large viewing angle 3D V-SLAM system for a mobile robot equipped with a Kinect camera is further developed. First, the AORB algorithm is used to achieve fast and effective matching between adjacent frames captured by the Kinect RGB camera under large viewing angle changes, establishing the correspondence between adjacent frames. Second, 2D image points are converted into 3D colored point cloud data using the calibrated intrinsic and extrinsic parameters of the Kinect and the pixel depth values after alignment correction. Third, the relative pose between adjacent frames is computed with a least-squares algorithm after outliers are removed by RANSAC (random sample consensus). Finally, the resulting poses are optimized with g2o (general graph optimization) to obtain the 3D model, thereby realizing large viewing angle 3D V-SLAM for the mobile robot. Both offline experiments (on well-known, publicly available benchmark datasets) and online experiments (with the developed mobile robot system) show that the proposed matching algorithm and the developed 3D V-SLAM system can accurately update the local model, successfully reconstruct the environment model, and effectively estimate the motion trajectory of the mobile robot in the presence of large viewing angle changes.
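The core of AORB is to combine the affine-simulation strategy of ASIFT (Yu and Morel) with the fast ORB detector and descriptor: the input image is warped under a small set of simulated camera tilts and in-plane rotations, ORB features are extracted from every simulated view, and the keypoints are mapped back to the original image frame so that frames taken from very different viewpoints can still be matched. The following Python/OpenCV sketch illustrates this idea; it is not the authors' implementation, and the tilt/rotation sampling grid, the feature budget, and the input file names are all illustrative assumptions.

```python
import cv2
import numpy as np

def affine_simulate(img, tilt, phi):
    """Warp `img` to simulate a viewpoint change (ASIFT-style): an
    in-plane rotation by `phi` degrees followed by a 1/tilt compression
    along one axis. Returns the warped image and the 2x3 affine matrix
    used, so keypoints can be mapped back afterwards."""
    h, w = img.shape[:2]
    # Rotation about the image center.
    A = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), phi, 1.0)
    # Directional compression modeling the camera tilt.
    A[0, :] /= tilt
    warped = cv2.warpAffine(img, A, (w, h))
    return warped, A

def aorb_features(img, tilts=(1.0, np.sqrt(2.0), 2.0),
                  phis=range(0, 180, 36)):
    """Detect ORB features on a set of affine-simulated views and map
    the keypoint coordinates back to the original image frame."""
    orb = cv2.ORB_create(nfeatures=500)  # feature budget is illustrative
    pts, descs = [], []
    for t in tilts:
        # The untilted view (t = 1) needs no rotation sweep.
        for phi in (phis if t > 1.0 else [0]):
            warped, A = affine_simulate(img, t, phi)
            kps, ds = orb.detectAndCompute(warped, None)
            if ds is None:
                continue
            Ainv = cv2.invertAffineTransform(A)
            for kp, d in zip(kps, ds):
                x, y = kp.pt
                # Map the keypoint back through the inverse warp.
                pts.append(Ainv @ np.array([x, y, 1.0]))
                descs.append(d)
    return np.array(pts), np.array(descs, dtype=np.uint8)

# Hypothetical adjacent frames; Hamming brute-force matching with
# cross-checking for symmetry.
pts1, d1 = aorb_features(cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE))
pts2, d2 = aorb_features(cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE))
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(d1, d2)
corr = [(pts1[m.queryIdx], pts2[m.trainIdx]) for m in matches]
```

Cross-checked Hamming matching is used here for brevity; ratio-test or RANSAC-based filters can be layered on top, and practical implementations typically mask the black borders introduced by the warps before detection.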
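Downstream of matching, the pipeline is geometric: each matched pixel (u, v) with depth Z is back-projected through the pinhole model as X = (u - cx)Z/fx, Y = (v - cy)Z/fy, and the relative pose between the two resulting 3D point sets is estimated by RANSAC over minimal three-point samples with a closed-form least-squares (Umeyama) refit on the consensus set. The sketch below is a minimal illustration of those two steps; the intrinsics, the iteration count, and the inlier threshold are assumed values, not the calibrated ones used in the paper.

```python
import numpy as np

# Assumed Kinect RGB intrinsics (replace with calibrated values).
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def back_project(u, v, depth):
    """Pinhole back-projection of pixel (u, v) with depth (meters)
    to a 3D point in the camera frame."""
    z = depth
    return np.array([(u - CX) * z / FX, (v - CY) * z / FY, z])

def umeyama_rigid(P, Q):
    """Closed-form least-squares rigid transform (R, t) aligning point
    set P onto Q (Umeyama, 1991, without scale). P, Q are (N, 3) arrays
    of corresponding 3D points."""
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    H = (P - mu_p).T @ (Q - mu_q)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.eye(3)
    S[2, 2] = np.sign(np.linalg.det(Vt.T @ U.T))  # avoid reflections
    R = Vt.T @ S @ U.T
    t = mu_q - R @ mu_p
    return R, t

def ransac_pose(P, Q, iters=500, thresh=0.03):
    """RANSAC over minimal 3-point samples: keep the transform with the
    most inliers (3D residual below `thresh` meters), then refit on the
    full consensus set."""
    rng = np.random.default_rng(0)
    n, best_inliers = len(P), None
    for _ in range(iters):
        idx = rng.choice(n, 3, replace=False)
        R, t = umeyama_rigid(P[idx], Q[idx])
        err = np.linalg.norm((P @ R.T + t) - Q, axis=1)
        inliers = err < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return umeyama_rigid(P[best_inliers], Q[best_inliers])
```

A pose graph built from the per-frame transforms estimated this way is what g2o then optimizes to produce the globally consistent 3D model.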