Citation: ZHANG Jin (张晋), TANG Jin (唐进), YIN Jianqin (尹建芹). Symmetric Residual Network for Human Motion Prediction[J]. ROBOT (机器人), 2022, 44(3): 291-298. DOI: 10.13973/j.cnki.robot.210188

Symmetric Residual Network for Human Motion Prediction

    Abstract: To study how different residual connection schemes affect convolutional neural networks (CNNs) for human motion prediction, this paper investigates how residual connections can be used, at a fixed network depth, to build a prediction model that captures human motion features efficiently. Based on the arrangement of human skeletal joints, a symmetric residual connection method is proposed for skeletal joint prediction, and a symmetric residual block (SRB) is designed on top of it. In the SRB, the receptive field of the last convolutional layer is maximized so that it covers the information of all joints of the human body, and the symmetric connections make efficient use of shallow dynamic features, which improves prediction performance while reducing the number of model parameters. An end-to-end convolutional network built from two SRBs and one decoder, named the symmetric residual network (SRNet), is also proposed, and it achieves better prediction results than the baseline methods. Human motion prediction experiments are carried out in the TensorFlow framework on two public datasets, Human3.6M and CMU-Mocap. The results show that, compared with the baseline methods, the proposed method reduces the mean per joint position error (MPJPE) by 0.2 mm to 1 mm at every prediction time point, which confirms that the proposed SRNet effectively models the global spatial features of human poses.
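The MPJPE metric reported above is commonly computed as the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints and frames. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `mpjpe`, the `[frames][joints][3]` layout, and the toy data are assumptions made for illustration.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error.

    pred, target: nested lists shaped [frames][joints][3] (assumed layout),
    with coordinates in millimetres. Returns the mean Euclidean distance
    between corresponding joints, in the same unit.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        if len(pred_frame) != len(target_frame):
            raise ValueError("frames must have the same number of joints")
        for p, t in zip(pred_frame, target_frame):
            total += math.dist(p, t)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("empty input")
    return total / count


# Hypothetical toy data: one frame with two joints.
pred = [[[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]]
target = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
print(mpjpe(pred, target))  # 2.5
```

In the toy example the two joint errors are 0 mm and 5 mm, so the mean is 2.5 mm. In a prediction benchmark, this value would typically be computed separately at each prediction time point.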

     
