Dynamic Path Planning of Low-altitude Aircraft Based on TCP-DQN Algorithm in Complex Environment

许振阳; 陈谋; 韩增亮; 邵书义

doi:10.13973/j.cnki.robot.240341

Abstract: To address the low training efficiency, slow convergence, and poor path feasibility that deep reinforcement learning algorithms exhibit in dynamic path planning for low-altitude aircraft, a dynamic path planning algorithm based on a deep Q-network with target-guided curriculum learning and prioritized experience replay (TCP-DQN) is proposed. First, a curriculum learning mechanism is introduced into the reinforcement learning framework, and a target-guided maneuver strategy is set, which speeds up training while improving the feasibility of the planned paths. Second, a combined reward function is constructed for training to address the sparse-reward problem of DQN, and the obstacle-avoidance experience of the low-altitude aircraft is replayed with priority to improve learning performance. Finally, simulation results of the TCP-DQN algorithm for path planning in a 3D dynamic low-altitude environment are presented. The results show that the algorithm can quickly plan safe and efficient flight paths for low-altitude aircraft in environments with dynamic, unknown threats.
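The prioritized replay idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how obstacle-avoidance experience is prioritized, so the sketch assumes standard proportional prioritization on the TD error and adds a hypothetical `avoidance_bonus` multiplier for transitions flagged as obstacle avoidance. The class name, parameters, and default values are all assumptions, and importance-sampling correction is omitted for brevity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch only)."""
    capacity: int
    alpha: float = 0.6            # how strongly priorities skew sampling (0 = uniform)
    avoidance_bonus: float = 2.0  # hypothetical boost for obstacle-avoidance transitions
    eps: float = 1e-3             # keeps every priority strictly positive
    _data: list = field(default_factory=list)
    _prio: list = field(default_factory=list)
    _next: int = 0

    def add(self, transition, td_error, avoided_obstacle=False):
        """Store a transition with priority (|td_error| + eps) ** alpha."""
        priority = (abs(td_error) + self.eps) ** self.alpha
        if avoided_obstacle:
            # Assumption: avoidance experience is replayed more often.
            priority *= self.avoidance_bonus
        if len(self._data) < self.capacity:
            self._data.append(transition)
            self._prio.append(priority)
        else:
            # Buffer is full: overwrite the oldest entry (ring buffer).
            self._data[self._next] = transition
            self._prio[self._next] = priority
        self._next = (self._next + 1) % self.capacity

    def sampling_probabilities(self):
        """Return each stored transition's probability of being drawn."""
        total = sum(self._prio)
        return [p / total for p in self._prio]

    def sample(self, batch_size, rng=random):
        """Draw a batch with replacement, weighted by priority."""
        idx = rng.choices(range(len(self._data)), weights=self._prio, k=batch_size)
        return [self._data[i] for i in idx]
```

In a DQN training loop, a buffer of this kind would be filled with `(state, action, reward, next_state, done)` tuples, and each minibatch would come from `sample` instead of a uniform draw.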
