Citation: CHEN Wei-dong, XI Yu-geng, GU Dong-lei. A SURVEY OF REINFORCEMENT LEARNING IN AUTONOMOUS MOBILE ROBOTS[J]. ROBOT, 2001, 23(4): 379-384.

A SURVEY OF REINFORCEMENT LEARNING IN AUTONOMOUS MOBILE ROBOTS

  • Abstract: Although autonomous mobile robots built on behavior-based control are robust across many tasks and environments, they lack the adaptability that dynamic environments require. Reinforcement learning (RL) lets a robot learn to complete a task without requiring its designer to specify all of its actions in advance. RL is a learning method that developed from the combination of dynamic programming and supervised learning: through trial-and-error interaction with the environment, the robot uses reward and punishment signals from its successes and failures to improve its performance step by step until it reaches its goal. RL is regarded as a very promising method for robot learning because of its ability to solve complex problems and, in particular, two advantages: (1) robot behaviors can be acquired merely by assigning rewards and punishments; (2) rewards and punishments may be delayed. This paper first analyzes the basic problems in robot learning, then introduces the principles and basic algorithms of RL, surveys the state of research on RL in autonomous robots, discusses several open problems and approaches to solving them, and finally outlines future directions.
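The trial-and-error loop with delayed reward described in the abstract can be illustrated with a minimal sketch. The abstract does not name a specific algorithm, so tabular Q-learning is used here purely as an assumed example; the one-dimensional corridor task, the function names `train_corridor` and `greedy_policy`, and all parameter values are hypothetical and are not taken from the paper.

```python
import random


def train_corridor(n_states=6, episodes=300, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    The robot starts in state 0. Only the step that enters the last state
    earns a reward of 1, so the reward for early moves is delayed.
    Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the episode length
            # Explore with probability epsilon, or when the two values tie.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            # The left wall blocks movement below state 0.
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # The goal is terminal, so no future value is added there.
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if s == goal:
                break
    return q


def greedy_policy(q):
    """Return the greedy action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]
```

Calling `greedy_policy(train_corridor())` returns the learned action for each non-terminal state. The designer supplies only the reward at the goal, and the discount factor `gamma` carries that delayed signal back to earlier states.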

     
