
Compliant Control Method for On-orbit Assembly by Space Robots Based on Reinforcement Learning



    Abstract: To address the vibration of assembly components and the dynamic coupling between robot and structure in on-orbit assembly by space robots, and to overcome the difficult parameter tuning and poor control performance of existing methods, a model-data hybrid-driven approach that integrates impedance control with deep reinforcement learning is proposed to enable efficient learning of assembly strategies. First, a modular on-orbit assembly scenario for a segmented space telescope is established, the dynamic coupling between the free-floating space robot and the assembly components is analyzed, and the modular assembly task is formulated as a Markov decision process. Then, joint impedance control of the space robot is introduced as a prior model, and a deep reinforcement learning-based assembly strategy learning method is developed to handle the dynamic coupling effects between the space robot and the assembly components. Finally, the proximal policy optimization (PPO) algorithm is employed to learn the assembly strategy. To enable rapid validation of the proposed strategy learning method, a parallelized training and testing environment for on-orbit assembly by space robots is built with Isaac Gym. Simulations and analyses verify that the proposed method improves compliant control performance and is robust to uncertainty.
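The combination of an impedance-control prior with a learned policy can be pictured as a joint-space impedance law whose torque command is corrected by an additive residual from the reinforcement-learning policy. The sketch below is a minimal illustration of that structure only; the function name, diagonal gain matrices, and all numerical values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def joint_impedance_torque(q, qd, q_des, K, D, residual=None):
    """Joint-space impedance control law with an optional learned residual:
        tau = K (q_des - q) - D qd (+ residual)

    q, qd    : current joint positions and velocities
    q_des    : desired joint positions
    K, D     : diagonal stiffness and damping gains (given as 1-D arrays)
    residual : torque correction output by the learned policy (assumed additive)
    """
    tau = K * (q_des - q) - D * qd
    if residual is not None:
        tau = tau + residual
    return tau

# Illustrative 3-joint example: the policy residual nudges the prior's command.
q = np.array([0.10, -0.20, 0.05])
qd = np.array([0.01, 0.00, -0.02])
q_des = np.zeros(3)
K = np.array([50.0, 50.0, 30.0])
D = np.array([5.0, 5.0, 3.0])
tau = joint_impedance_torque(q, qd, q_des, K, D,
                             residual=np.array([0.1, 0.0, -0.1]))
# → array([-4.95, 10.  , -1.54])
```

Using the impedance law as the prior means the policy only has to learn a correction around an already-compliant baseline, which is what makes the model-data hybrid formulation sample-efficient compared with learning joint torques from scratch.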
