LI Zhiqi, WANG Bin, LIU Hong. Learning Batting Policy for a Robot Table Tennis Player Based on Support Vector Regression[J]. ROBOT, 2014, 36(1): 14-20. DOI: 10.3724/SP.J.1218.2014.00014

Learning Batting Policy for a Robot Table Tennis Player Based on Support Vector Regression

    Abstract: A method based on support vector regression (SVR) is proposed for learning the batting policy of a 7-DoF (degree-of-freedom) anthropomorphic table tennis robot that must return the ball to a desired location. First, the robot's batting process is formalized as a batting evaluation function, which takes the state of the incoming ball and the parameters of the batting trajectory as input and outputs a reward. Then, a random search method based on the confidence region of the physical model is proposed to collect training data efficiently, and the batting evaluation function is obtained by generalizing the empirical data set with ε-support vector regression (ε-SVR). Finally, during decision making, the optimal batting trajectory is computed by maximizing the batting evaluation function with a multi-start quasi-Newton method. The proposed method is applied to a 7-DoF table tennis robot system, and the experimental results verify its effectiveness.
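The decision step described in the abstract (maximizing the batting evaluation function from several initial values with a quasi-Newton method) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `evaluation` is a hypothetical analytic stand-in for the learned ε-SVR model, the batting trajectory is reduced to two parameters, the gradient is taken by finite differences, and the starting points form a fixed grid.

```python
import math


def evaluation(state, params):
    """Hypothetical stand-in for the learned batting evaluation function.

    Maps (ball state, trajectory parameters) to a reward. Two Gaussian bumps
    give one global and one local maximum over the parameters.
    """
    cx = 0.5 * state[0]  # assumed: the best parameters shift with the ball state
    main = math.exp(-((params[0] - cx) ** 2 + (params[1] - 1.0) ** 2))
    side = 0.6 * math.exp(-((params[0] + 2.0) ** 2 + (params[1] + 2.0) ** 2))
    return main + side


def gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x."""
    g = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g


def bfgs_minimize(f, x0, iters=100, tol=1e-8):
    """Minimize f with BFGS (a quasi-Newton method) and a backtracking line search."""
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # inverse Hessian estimate
    g = gradient(f, x)
    for _ in range(iters):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        t, fx = 1.0, f(x)
        slope = sum(gi * di for gi, di in zip(g, d))
        # Halve the step until the Armijo sufficient-decrease condition holds.
        while f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope and t > 1e-12:
            t *= 0.5
        x_new = [xi + t * di for xi, di in zip(x, d)]
        g_new = gradient(f, x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # curvature condition; skip the update otherwise
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += (1 + rho * yHy) * rho * s[i] * s[j] \
                        - rho * (Hy[i] * s[j] + s[i] * Hy[j])
        x, g = x_new, g_new
    return x


def best_batting_params(state, grid=(-2.0, 0.0, 2.0)):
    """Multi-start search: run BFGS from each grid point and keep the best result."""
    def neg(p):
        return -evaluation(state, p)  # maximizing the reward = minimizing its negative

    best = None
    for a in grid:
        for b in grid:
            x = bfgs_minimize(neg, [a, b])
            if best is None or evaluation(state, x) > evaluation(state, best):
                best = x
    return best
```

Calling `best_batting_params([1.0])` returns the parameter pair with the highest reward found across all starts. Restarting from several points matters here because a single run started near the smaller bump would stop at the local maximum.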

     
