Abstract: Although autonomous mobile robots built on behaviour-based approaches are robust across many tasks and environments, they do not necessarily adapt well to dynamic environments. Reinforcement learning (RL) offers a powerful set of techniques that allow a robot to learn a task without requiring its designer to fully specify how the task should be carried out. RL is an approach to machine intelligence that combines ideas from dynamic programming and supervised learning. It is widely regarded as a promising method for robot learning because of the following advantages: (1) robot behaviours can be acquired simply by assigning rewards and punishments; (2) rewards and punishments may be delayed with respect to the actions that earn them. This paper first analyses the basic problems in robot learning, then introduces the principle and basic algorithms of RL, discusses several important open problems together with approaches for solving them, and finally points out directions for future development.
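To make the two advantages above concrete, the sketch below shows tabular Q-learning (one of the basic RL algorithms the paper refers to) on a hypothetical one-dimensional corridor task: the robot receives a single delayed reward only when it reaches the goal cell, yet a useful behaviour emerges from that signal alone. The task, state layout, and parameter values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of tabular Q-learning on a hypothetical 1-D corridor task.
# The reward is delayed: it arrives only when the goal cell is reached,
# and the Q-learning update propagates credit backwards over episodes.
import random

N_STATES = 6          # cells 0..5; cell 5 is the goal (assumed task)
ACTIONS = [-1, +1]    # move left / move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # illustrative learning parameters

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy environment: zero reward everywhere except at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next, r, done = step(s, a)
        # Q-learning update (Watkins & Dayan): blend observed reward with
        # the discounted value of the best action in the next state
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy in every cell should be "move right"
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```

The point of the sketch is that the designer specifies only the reward function, not the behaviour itself; the mapping from states to actions is acquired by the learner.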