A robust RGB-D image alignment method is proposed for autonomously building 3D maps with a service robot equipped with an inexpensive RGB-D camera. The transformation between frames is estimated from matched sets of feature points; the RANdom SAmple Consensus (RANSAC) algorithm eliminates false matches, and its inlier-counting policy is modified to account for the spatial nonuniformity of the feature points. In addition, the floor is detected and a coplanarity constraint is imposed to improve the alignment of the point sets. Experiments are conducted on RGB-D image sequences collected by a robot in a real indoor environment. The error rate of frame-to-frame alignment is zero, and the global floor error is less than 2 cm, so the 3D mapping process runs accurately and continuously. The results show that the floor information effectively improves the global precision of the map, and that the method is robust and accurate.
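As a rough illustration of the frame-to-frame step, the following is a minimal sketch of RANSAC-based rigid alignment of matched 3D feature points, using the classic SVD least-squares fit for the rigid transform. All function names, the iteration count, and the residual threshold are illustrative assumptions; in particular, the plain Euclidean inlier test below does not reproduce the paper's modified inlier-counting policy for spatially nonuniform features, nor the floor coplanarity constraint.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src -> dst,
    computed via SVD of the cross-covariance (Arun-style fit)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

def ransac_rigid_transform(src, dst, iters=500, thresh=0.03, seed=0):
    """Sketch of RANSAC over matched 3D point pairs (iters and thresh are
    assumed values): sample 3 matches, fit a rigid transform, count inliers
    by Euclidean residual, then refit on the best inlier set."""
    rng = np.random.default_rng(seed)
    n = src.shape[0]
    best_inliers = None
    for _ in range(iters):
        idx = rng.choice(n, size=3, replace=False)   # minimal 3-point sample
        R, t = fit_rigid_transform(src[idx], dst[idx])
        residuals = np.linalg.norm((src @ R.T + t) - dst, axis=1)
        inliers = residuals < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final refinement on all inliers of the best hypothesis.
    R, t = fit_rigid_transform(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers
```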