Visual Recognition of Human Pose for the Transfer-care Assistant Robot
LIU Jinyue (1,2,3), LI Shunda (1,2,3), CHEN Mengqian (1,2,3), GUO Shijie (1,2,3)
1. School of Mechanical Engineering, Hebei University of Technology, Tianjin 300132, China;
2. State Key Laboratory of Reliability and Intelligentization of Electrical Equipment, Tianjin 300132, China;
3. Hebei Key Laboratory of Robot Perception and Human-machine Fusion, Tianjin 300132, China
Abstract: A two-level convolutional neural network algorithm based on RGB-D (RGB-depth) information is proposed to meet the requirements of high accuracy and close-range adaptability in human pose detection for the transfer-care assistant robot system. The first-level network computes the pixel coordinates of the human joints in the color image, and these coordinates are mapped into the depth image to generate joint heatmaps. A convolutional neural network structure is proposed as the second-level network, which takes the depth image and the joint heatmaps as input and estimates the 3D positions of the human joints. Based on the estimated joint positions, the subaxillary points are further located by an image segmentation method. Experimental results show that a single computation with the proposed method takes 210 ms, and that the accuracy of human pose recognition reaches 91.5% in the application environment of the transfer-care assistant robot and 90.3% in the close-range environment. The proposed method can accurately estimate the global coordinates of the human pose in real time.
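As an illustration of the intermediate step described above (mapping a joint detected in the color image into depth-image coordinates and building a joint heatmap as input to the second-level network), the following is a minimal Python sketch, not the authors' implementation. It assumes the RGB and depth frames are registered, so the mapping reduces to a resolution rescale; the image resolutions, the Gaussian width sigma, and all function names are illustrative assumptions.

```python
import numpy as np

# Assumption: the RGB and depth frames are registered (hardware-aligned),
# so a color-pixel joint maps to depth coordinates by a resolution rescale.
# Unregistered sensors would instead need the camera intrinsics/extrinsics.

def map_joint_to_depth(u_rgb, v_rgb, rgb_shape, depth_shape):
    """Rescale a joint pixel from color-image to depth-image coordinates."""
    v = v_rgb * depth_shape[0] / rgb_shape[0]
    u = u_rgb * depth_shape[1] / rgb_shape[1]
    return u, v

def joint_heatmap(u, v, depth_shape, sigma=7.0):
    """2D Gaussian centered at the mapped joint; one channel per joint."""
    ys, xs = np.mgrid[0:depth_shape[0], 0:depth_shape[1]]
    return np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2.0 * sigma ** 2))

# Usage: one heatmap per joint, stacked with the depth image as network input.
u, v = map_joint_to_depth(640, 360, rgb_shape=(720, 1280), depth_shape=(424, 512))
hm = joint_heatmap(u, v, depth_shape=(424, 512))
print(hm.shape, hm.max())  # (424, 512) 1.0
```

Encoding the first-level 2D detections as Gaussian heatmaps, rather than raw coordinates, gives the second-level network a spatially aligned input it can convolve together with the depth image.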