A NEW MULTI-AGENT REINFORCEMENT LEARNING ALGORITHM AND ITS APPLICATION TO MULTI-ROBOT COOPERATION TASKS
GU Guo-chang1, ZHONG Yu1, ZHANG Ru-bo1,2
1. School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China; 2. Robotics Laboratory, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Abstract: In multi-robot systems, joint actions must be employed to achieve cooperation because the evaluation of one robot's behavior often depends on the other robots' behaviors. However, joint-action reinforcement learning algorithms suffer from slow convergence because of the enormous learning space produced by joint actions. In this paper, a prediction-based reinforcement learning algorithm is presented for multi-robot cooperation tasks, which requires every robot to learn to predict the probabilities of the actions that the other robots may execute. A multi-robot cooperation experiment is conducted to test the efficacy of the new algorithm, and the experimental results show that the new algorithm learns the cooperation strategy much faster than the primitive reinforcement learning algorithm.
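The abstract describes the algorithm only at a high level, so the following is a minimal illustrative sketch rather than the authors' exact method: each robot maintains an empirical frequency model of the other robot's actions in every state and selects its own action by maximizing the joint-action Q-value averaged over those predicted probabilities. The class name PredictiveQAgent, the two-robot setting, and the parameters alpha, gamma, and epsilon are assumptions introduced for illustration.

```python
import random
from collections import defaultdict

class PredictiveQAgent:
    """Hypothetical sketch of prediction-based joint-action Q-learning.

    The agent keeps a Q-table over (state, own_action, other_action) and an
    empirical count of the other robot's actions per state, which it uses to
    predict the probability of each action the other robot may execute.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions                      # shared action set (assumption)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)                 # Q[(state, my_a, other_a)]
        self.counts = defaultdict(lambda: defaultdict(int))  # counts[state][other_a]

    def predict(self, state):
        """Empirical probabilities of the other robot's actions in `state`."""
        c = self.counts[state]
        total = sum(c.values())
        if total == 0:                              # no observations yet: uniform prior
            return {a: 1.0 / len(self.actions) for a in self.actions}
        return {a: c[a] / total for a in self.actions}

    def expected_value(self, state, my_a):
        """Q-value of my_a averaged over the predicted other-robot action."""
        p = self.predict(state)
        return sum(p[oa] * self.q[(state, my_a, oa)] for oa in self.actions)

    def choose_action(self, state):
        """Epsilon-greedy choice over expected (prediction-weighted) Q-values."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.expected_value(state, a))

    def update(self, state, my_a, other_a, reward, next_state):
        """Q-learning backup on the observed joint action plus prediction update."""
        self.counts[state][other_a] += 1            # refine the action-frequency model
        best_next = max(self.expected_value(next_state, a) for a in self.actions)
        key = (state, my_a, other_a)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
```

Under this reading, the joint-action value table is still learned, but each robot's action selection collapses the other robot's dimension through the predicted action distribution, which is one way the enormous joint-action space can be searched more efficiently than by treating every joint action as equally likely.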