Application of Reinforcement Learning to Basic Action Learning of Soccer Robot
DUAN Yong1, YANG Huai-qing1, CUI Bao-xia1, XU Xin-he2
1. School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110178, China; 2. Institute of Artificial Intelligence and Robotics, Northeastern University, Shenyang 110004, China
Abstract: This paper discusses the reinforcement learning (RL) algorithm and its application to the technical action learning of soccer robots. In RL, because the state space and action space are too large or their variables are continuous, learning is too slow and usually struggles to converge. To solve this problem, an RL method based on a T-S model fuzzy neural network is proposed, which can effectively perform the mapping from the state space to the action space of RL. Furthermore, the proposed method is used to design technical actions of a soccer robot, and behavior learning of the robot without expert knowledge or an environment model is discussed. Finally, experiments are carried out, and the results show that the presented method is effective and can meet the demands of a robot soccer match.