A 3D modeling method for common household objects is proposed to support hand-eye coordinated robotic grasping. RGB and depth images are captured simultaneously with an RGB-D sensor, and feature points and their descriptors are extracted from the RGB image. Correspondences between adjacent frames are established by matching the feature descriptors. A three-point algorithm based on RANSAC (RANdom SAmple Consensus) then computes the relative pose between adjacent frames. Using loop closure, the result is refined by minimizing the re-projection error with the Levenberg-Marquardt algorithm. With this method, a dense 3D point cloud model of an object can be obtained simply by placing the object on a flat table and collecting ten to twenty frames of data around it. 3D models are built for twenty household objects that are suitable for a service robot to grasp. Experimental results show that the error is about 1 mm for models with diameters between 5 cm and 7 cm, which satisfies the requirements of pose determination for robot grasping.
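The abstract names a RANSAC-based three-point algorithm for the relative pose between adjacent frames but gives no implementation details. The following is a minimal sketch of that idea in pure Python, assuming the matched feature points have already been back-projected to 3D with the depth image. The function names (`frame_from_points`, `pose_from_three`, `ransac_pose`), the inlier tolerance, and the iteration count are illustrative assumptions, not values from the paper. The minimal solver here builds an orthonormal frame from each point triple, which is a simpler stand-in for a least-squares fit. The Levenberg-Marquardt refinement is not shown.

```python
import math
import random


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def frame_from_points(p0, p1, p2, eps=1e-6):
    """Orthonormal frame (e1, e2, e3) from three points, or None if degenerate."""
    e1 = sub(p1, p0)
    n1 = norm(e1)
    if n1 < eps:                      # p0 and p1 coincide
        return None
    e1 = (e1[0] / n1, e1[1] / n1, e1[2] / n1)
    n = cross(e1, sub(p2, p0))
    nn = norm(n)
    if nn < eps:                      # the three points are (nearly) collinear
        return None
    e3 = (n[0] / nn, n[1] / nn, n[2] / nn)
    e2 = cross(e3, e1)
    return (e1, e2, e3)


def transform(R, t, p):
    """Apply the rigid transform R * p + t to a 3D point."""
    return tuple(R[r][0] * p[0] + R[r][1] * p[1] + R[r][2] * p[2] + t[r]
                 for r in range(3))


def pose_from_three(src3, dst3):
    """Rigid transform taking the src triple onto the dst triple (minimal solver)."""
    fa = frame_from_points(*src3)
    fb = frame_from_points(*dst3)
    if fa is None or fb is None:
        return None
    # R maps each source frame axis onto the matching destination axis.
    R = tuple(tuple(sum(fb[k][r] * fa[k][c] for k in range(3))
                    for c in range(3)) for r in range(3))
    t = sub(dst3[0], transform(R, (0.0, 0.0, 0.0), src3[0]))
    return R, t


def ransac_pose(src, dst, iters=200, tol=0.002, seed=0):
    """Return (R, t, inlier_indices) for matched 3D points src[i] <-> dst[i]."""
    rng = random.Random(seed)
    n = len(src)
    best = (None, None, [])
    if n < 3:
        return best
    for _ in range(iters):
        i, j, k = rng.sample(range(n), 3)
        pose = pose_from_three((src[i], src[j], src[k]),
                               (dst[i], dst[j], dst[k]))
        if pose is None:
            continue
        R, t = pose
        # A match is an inlier if the transformed source lands within tol of its target.
        inliers = [m for m in range(n)
                   if norm(sub(transform(R, t, src[m]), dst[m])) < tol]
        if len(inliers) > len(best[2]):
            best = (R, t, inliers)
    return best
```

Calling `ransac_pose(src, dst)` on two lists of matched 3D points (in metres here, by assumption) returns the rotation as a 3x3 nested tuple, the translation, and the indices of the matches consistent with that pose. In practice the pose would normally be re-estimated from all inliers before any global refinement.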