DUO Nanxun, LÜ Qiang, LIN Huican, WEI Heng. Step into High-Dimensional and Continuous Action Space: A Survey on Applications of Deep Reinforcement Learning to Robotics. ROBOT, 2019, 41(2): 276-288. DOI: 10.13973/j.cnki.robot.180336.
Abstract: Firstly, the emergence and development of deep reinforcement learning (DRL) are reviewed. Secondly, DRL algorithms for high-dimensional and continuous action spaces are classified into value-function-approximation-based algorithms, policy-approximation-based algorithms, and algorithms based on other structures. Then, typical DRL algorithms are introduced, with emphasis on their underlying ideas, advantages, and disadvantages. Finally, future trends in applying DRL to robotics are forecast according to the development directions of DRL algorithms.
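As a concrete illustration of the value-function-approximation family mentioned in the abstract (the DQN line of work, refs. [11]-[12]), the minimal Python sketch below trains a linear Q-function toward the DQN-style TD target r + gamma * max_a' Q_target(s', a'), using a periodically synced target network on a toy chain MDP. The environment, one-hot features, and hyperparameters are illustrative assumptions and are not taken from the survey.

# Minimal value-function-approximation sketch (illustrative; not from the survey).
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr = 4, 2, 0.9, 0.1

# Linear approximator: Q(s, .) = W @ phi(s), with one-hot state features phi(s).
W = rng.normal(scale=0.1, size=(n_actions, n_states))   # online network
W_target = W.copy()                                      # frozen target network

def phi(s):
    f = np.zeros(n_states)
    f[s] = 1.0
    return f

def step(s, a):
    # Toy chain MDP: action 1 moves right (reward 1 whenever the agent ends in the
    # last state), action 0 resets to the first state.
    s2 = min(s + 1, n_states - 1) if a == 1 else 0
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

s = 0
for t in range(5000):
    # epsilon-greedy behaviour policy
    a = rng.integers(n_actions) if rng.random() < 0.1 else int(np.argmax(W @ phi(s)))
    s2, r = step(s, a)
    # DQN-style TD target computed with the frozen target network
    target = r + gamma * np.max(W_target @ phi(s2))
    td_error = target - (W @ phi(s))[a]
    W[a] += lr * td_error * phi(s)        # semi-gradient Q-learning update
    if t % 200 == 0:
        W_target = W.copy()               # periodic target-network synchronization
    s = s2

print(np.round(W, 2))                     # learned Q(s, a) values (columns = states)

Replacing the linear map with a deep network and adding experience replay recovers the DQN setup of [12]; the policy-approximation family surveyed alongside it instead parameterizes the policy directly and updates it with policy gradients, e.g. TRPO [46], DPG [47], DDPG [49], and PPO [51].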
[1] Caloud P, Choi W, Latombe J C, et al. Indoor automation with many mobile robots[C]//IEEE International Workshop on Intelligent Robots and Systems. Piscataway, USA:IEEE, 1990:67-72.
[2] Burgard W, Moors M, Stachniss C, et al. Coordinated multi-robot exploration[J]. IEEE Transactions on Robotics, 2005, 21(3):376-386.
[3] Qian S H, Ge S R, Wang Y S, et al. Research status of the disaster rescue robot and its applications to the mine rescue[J]. Robot, 2006, 28(3):350-354.
[4] Roberts R, Ta D N, Straub J, et al. Saliency detection and model-based tracking:A two part vision system for small robot navigation in forested environment[M]//Proceedings of SPIE, Vol.8387. Bellingham, USA:SPIE, 2012:No.838705.
[5] Jiang X Z. Servo control of joint driven by two pneumatic muscles in opposing pair configuration for rehabilitation robot[D]. Wuhan:Huazhong University of Science and Technology, 2011.
[6] Tesauro G. TD-gammon, a self-teaching backgammon program, achieves master-level play[J]. Neural Computation, 1994, 6(2):215-219.
[7] Silver D, Schrittwieser J, Simonyan K, et al. Mastering the game of Go without human knowledge[J]. Nature, 2017, 550(7676):354-359.
[8] Tian Y D, Zhu Y. Better computer Go player with neural network and long-term prediction[EB/OL]. (2016-02-29)[2018-05-01]. https://arxiv.org/pdf/1511.06410.pdf.
[9] Kocsis L, Szepesvári C. Bandit based Monte-Carlo planning[M]//Lecture Notes in Computer Science, Vol.4212. Berlin, Germany:Springer, 2006:282-293.
[10] Zhao T T, Hachiya H, Niu G, et al. Analysis and improvement of policy gradient estimation[J]. Neural Networks, 2012, 26(2):118-129.
[11] Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with deep reinforcement learning[EB/OL]. (2013-12-19)[2018-05-01]. https://arxiv.org/pdf/1312.5602.pdf.
[12] Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540):529-533.
[13] Watkins C J C H, Dayan P. Technical note:Q-learning[J]. Machine Learning, 1992, 8(3-4):279-292.
[14] Riedmiller M. Neural fitted Q iteration-First experiences with a data efficient neural reinforcement learning method[C]//16th European Conference on Machine Learning. Berlin, Germany:Springer-Verlag, 2005:317-328.
[15] Lange S, Riedmiller M. Deep auto-encoder neural networks in reinforcement learning[C]//International Joint Conference on Neural Networks. Piscataway, USA:IEEE, 2010.
[16] Farahmand A M, Nabi S, Nikovski D N. Deep reinforcement learning for partial differential equation control[C]//American Control Conference. Piscataway, USA:IEEE, 2017:3120-3127.
[17] Kober J, Bagnell J A, Peters J. Reinforcement learning in robotics:A survey[J]. International Journal of Robotics Research, 2013, 32(11):1238-1274.
[18] Barto A G, Mahadevan S. Recent advances in hierarchical reinforcement learning[J]. Discrete Event Dynamic Systems, 2003, 13(1-2):41-77.
[19] Wen N, Liu Z H, Zhu L P, et al. Deep reinforcement learning and its application on autonomous shape optimization for morphing aircrafts[J]. Journal of Astronautics, 2017, 38(11):1153-1159.
[20] Parisi S, Ramstedt S, Peters J. Goal-driven dimensionality reduction for reinforcement learning[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, USA:IEEE, 2017:4634-4639.
[21] Duan Y, Chen X, Houthooft R, et al. Benchmarking deep reinforcement learning for continuous control[C]//33rd International Conference on Machine Learning. USA:International Machine Learning Society, 2016:2001-2014.
[22] Laskey M, Chuck C, Lee J, et al. Comparing human-centric and robot-centric sampling for robot deep learning from demonstrations[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2017:358-365.
[23] Thananjeyan B, Garg A, Krishnan S, et al. Multilateral surgical pattern cutting in 2D orthotropic gauze with deep reinforcement learning policies for tensioning[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2017:2371-2378.
[24] Chebotar Y, Hausman K, Zhang M, et al. Combining model-based and model-free updates for trajectory-centric reinforcement learning[C]//34th International Conference on Machine Learning. USA:International Machine Learning Society, 2017:1173-1185.
[25] Shi G Q. Implementation of omni-directional walking and high-level decision for humanoid robots in RoboCup3D simulation system[D]. Hefei:Hefei University of Technology, 2010.
[26] Popov I, Heess N, Lillicrap T, et al. Data-efficient deep reinforcement learning for dexterous manipulation[EB/OL]. (2017-04-10)[2018-05-01]. https://arxiv.org/pdf/1704.03073.pdf.
[27] Inoue T, de Magistris G, Munawar A, et al. Deep reinforcement learning for high precision assembly tasks[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, USA:IEEE, 2017:819-825.
[28] Kirchner F. Q-learning of complex behaviours on a six-legged walking machine[J]. Robotics and Autonomous Systems, 1998, 25(3-4):253-262.
[29] Hart S, Grupen R. Learning generalizable control programs[J]. IEEE Transactions on Autonomous Mental Development, 2011, 3(3):216-231.
[30] Lin L J. Self-improving reactive agents based on reinforcement learning, planning and teaching[J]. Machine Learning, 1992, 8(3-4):293-321.
[31] Zhang Q C, Lin M, Yang L T, et al. Energy-efficient scheduling for real-time systems based on deep Q-learning model[J]. IEEE Transactions on Sustainable Computing, 2017. DOI:10.1109/TSUSC.2017.2743704.
[32] Schaul T, Quan J, Antonoglou I, et al. Prioritized experience replay[EB/OL]. (2016-02-25)[2018-05-01]. https://arxiv.org/pdf/1511.05952.pdf.
[33] Thrun S, Schwartz A. Issues in using function approximation for reinforcement learning[C]//Proceedings of the 1993 Connectionist Models Summer School. Mahwah, USA:Lawrence Erlbaum Associates, 1994:255-263.
[34] van Hasselt H, Guez A, Silver D. Deep reinforcement learning with double Q-learning[EB/OL]. (2015-12-08)[2018-05-01]. https://arxiv.org/pdf/1509.06461.pdf.
[35] Ngai D C K, Yung N H C. Double action Q-learning for obstacle avoidance in a dynamically changing environment[C]//IEEE Intelligent Vehicles Symposium. Piscataway, USA:IEEE, 2005:211-216.
[36] van Hasselt H. Double Q-learning[C]//24th Annual Conference on Neural Information Processing Systems. USA:Curran Associates Inc., 2010:2613-2621.
[37] Wang Z Y, Schaul T, Hessel M, et al. Dueling network architectures for deep reinforcement learning[C]//33rd International Conference on Machine Learning. USA:International Machine Learning Society, 2016:2939-2947.
[38] Tai L, Li S H, Liu M. A deep-network solution towards model-less obstacle avoidance[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, USA:IEEE, 2016:2759-2764.
[39] Tai L, Liu M. A robot exploration strategy based on Q-learning network[C]//IEEE International Conference on Real-Time Computing and Robotics. Piscataway, USA:IEEE, 2016:57-62.
[40] Sasaki H, Horiuchi T, Kato S. Experimental study on behavior acquisition of mobile robot by deep Q-network[J]. Journal of Advanced Computational Intelligence and Intelligent Informatics, 2017, 21(5):840-848.
[41] Miyazaki K, Kimura H, Kobayashi S, et al. Theory and applications of reinforcement learning based on profit sharing[J]. Journal of Japanese Society for Artificial Intelligence, 1999, 14(5):800-807.
[42] Bai T Z, Yang J N, Chen J, et al. Double-task deep Q-Learning with multiple views[C]//2017 IEEE International Conference on Computer Vision Workshops. Piscataway, USA:IEEE, 2018:1050-1058.
[43] Gu S X, Lillicrap T, Sutskever I, et al. Continuous deep Q-learning with model-based acceleration[C]//33rd International Conference on Machine Learning. USA:International Machine Learning Society, 2016:4135-4148.
[44] Arulkumaran K, Deisenroth M, Brundage M, et al. A brief survey of deep reinforcement learning[EB/OL]. (2017-09-28)[2018-05-01]. https://arxiv.org/pdf/1708.05866.pdf.
[45] Kakade S, Langford J. Approximately optimal approximate reinforcement learning[C]//19th International Conference on Machine Learning. San Francisco, USA:Morgan Kaufmann Publishers Inc., 2002:267-274.
[46] Schulman J, Levine S, Moritz P, et al. Trust region policy optimization[C]//32nd International Conference on Machine Learning. USA:International Machine Learning Society, 2015:1889-1897.
[47] Silver D, Lever G, Heess N, et al. Deterministic policy gradient algorithms[C]//31st International Conference on Machine Learning. USA:International Machine Learning Society, 2014:605-619.
[48] Rosenstein M T, Barto A G. Supervised actor-critic reinforcement learning[M]//Handbook of Learning and Approximate Dynamic Programming. Piscataway, USA:Wiley-IEEE Press, 2004:359-380.
[49] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning[EB/OL]. (2016-02-29)[2018-05-01]. https://arxiv.org/pdf/1509.02971.pdf.
[50] Heess N, Tb D, Sriram S, et al. Emergence of locomotion behaviours in rich environments[EB/OL]. (2017-07-10)[2018-05-01]. https://arxiv.org/pdf/1707.02286.pdf.
[51] Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms[EB/OL]. (2017-08-28)[2018-05-01]. https://arxiv.org/pdf/1707.06347.pdf.
[52] Schulman J, Moritz P, Levine S, et al. High-dimensional continuous control using generalized advantage estimation[EB/OL]. (2016-09-09)[2018-05-01]. https://arxiv.org/pdf/1506.02438.pdf.
[53] Huang P H, Hasegawa O. Learning quadcopter maneuvers with concurrent methods of policy optimization[J]. Journal of Advanced Computational Intelligence and Intelligent Informatics, 2017, 21(4):639-649.
[54] Tai L, Paolo G, Liu M. Virtual-to-real deep reinforcement learning:Continuous control of mobile robots for mapless navigation[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, USA:IEEE, 2017:31-36.
[55] Mnih V, Badia A P, Mirza M, et al. Asynchronous methods for deep reinforcement learning[C]//33rd International Conference on Machine Learning. USA:International Machine Learning Society, 2016:2850-2869.
[56] Williams R J, Peng J. Function optimization using connectionist reinforcement learning algorithms[J]. Connection Science, 1991, 3(3):241-268.
[57] Babaeizadeh M, Frosio I, Tyree S, et al. Reinforcement learning through asynchronous advantage actor-critic on a GPU[EB/OL]. (2017-03-02)[2018-05-01]. https://arxiv.org/pdf/1611.06256.pdf.
[58] Gu S X, Holly E, Lillicrap T, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2017:3389-3396.
[59] Ghadirzadeh A, Maki A, Kragic D, et al. Deep predictive policy training using reinforcement learning[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, USA:IEEE, 2017:2351-2358.
[60] Espeholt L, Soyer H, Munos R, et al. IMPALA:Scalable distributed deep-RL with importance weighted actor-learner architectures[EB/OL]. (2018-06-28)[2018-08-01]. https://arxiv.org/pdf/1802.01561.pdf.
[61] Jaderberg M, Mnih V, Czarnecki W M, et al. Reinforcement learning with unsupervised auxiliary tasks[EB/OL]. (2016-11-16)[2018-05-01]. https://arxiv.org/pdf/1611.05397.pdf.
[62] Mirowski P, Pascanu R, Viola F, et al. Learning to navigate in complex environments[EB/OL]. (2017-01-13)[2018-05-01]. https://arxiv.org/pdf/1611.03673.pdf.
[63] Deisenroth M P, Rasmussen C E. PILCO:A model-based and data-efficient approach to policy search[C]//28th International Conference on Machine Learning. New York, USA:ACM, 2011:465-472.
[64] Gal Y, Ghahramani Z. Dropout as a Bayesian approximation:Representing model uncertainty in deep learning[C]//33rd International Conference on Machine Learning. USA:International Machine Learning Society, 2016:1651-1660.
[65] Xia C, El Kamel A. Neural inverse reinforcement learning in autonomous navigation[J]. Robotics and Autonomous Systems, 2016, 84:1-14.
[66] Abbeel P, Ng A Y. Apprenticeship learning via inverse reinforcement learning[C]//21st International Conference on Machine Learning. New York, USA:ACM, 2004:1-8.
[67] Ho J, Gupta J K, Ermon S. Model-free imitation learning with policy optimization[C]//33rd International Conference on Machine Learning. USA:International Machine Learning Society, 2016:4036-4046.
[68] Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks[EB/OL]. (2014-06-10)[2018-05-01]. https://arxiv.org/pdf/1406.2661.pdf.
[69] Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks[EB/OL]. (2016-01-07)[2018-05-01]. https://arxiv.org/pdf/1511.06434.pdf.
[70] Ho J, Ermon S. Generative adversarial imitation learning[EB/OL]. (2016-06-10)[2018-05-01]. https://arxiv.org/pdf/1606.03476.pdf.
[71] Merel J, Tassa Y, Tb D, et al. Learning human behaviors from motion capture by adversarial imitation[EB/OL]. (2017-06-07)[2018-05-01]. https://arxiv.org/pdf/1707.02201.pdf.
[72] Tai L, Zhang J W, Liu M, et al. Socially compliant navigation through raw depth inputs with generative adversarial imitation learning[EB/OL]. (2018-02-26)[2018-05-01]. https://arxiv.org/pdf/1710.02543.pdf.
[73] Hecht-Nielsen R. Theory of the backpropagation neural network[C]//International Joint Conference on Neural Networks. Piscataway, USA:IEEE, 1989:593-605.