A Self-supervised Learning Method of Target Pushing-Grasping Skills Based on Affordance Map
WU Peiliang1,2, LIU Ruijun1, MAO Bingyi1,2, SHI Haoyang1, CHEN Wenbai3, GAO Guowei3
1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China; 2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China; 3. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China
Abstract: A self-supervised learning method for target pushing-grasping skills based on an affordance map is presented. First, the self-supervised learning problem of a robot acquiring target pushing-grasping skills in a cluttered environment is formulated, and the robot's pushing and grasping decisions in the workspace are modeled as a new Markov decision process (MDP) in which the vision mechanism module and the action mechanism module are trained separately. Second, adaptive parameters and a group split-attention module are fused into the vision mechanism module to design the feature extraction network RGSA-Net, which generates an affordance map from the original state image fed to the network and thereby provides a sound basis for the target pushing-grasping operation. Then, DQAC, a self-supervised training framework based on deep reinforcement learning with an actor-critic architecture, is built in the action mechanism module: after the robot performs an action according to the affordance map, the DQAC framework evaluates that action, achieving better coordination between pushing and grasping. Finally, comparative experiments verify the effectiveness of the proposed method.
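The decision loop summarized in the abstract — generate pixel-wise affordance maps, execute the highest-scoring push or grasp, then evaluate the executed action with an actor-critic style update — can be sketched as follows. This is a minimal illustration only: the random-projection "network", the function names, and the scalar critic update are hypothetical stand-ins, not the paper's actual RGSA-Net or DQAC implementation.

```python
import numpy as np

def affordance_map(state_rgbd, num_rotations=8):
    """Stand-in for a learned affordance network: produce one push score
    and one grasp score per pixel and rotation. A seeded random tensor
    replaces the real forward pass purely for illustration."""
    h, w = state_rgbd.shape[:2]
    rng = np.random.default_rng(0)
    push = rng.random((num_rotations, h, w))
    grasp = rng.random((num_rotations, h, w))
    return push, grasp

def select_action(push_map, grasp_map):
    """Greedy action selection over both primitives: pick the primitive,
    end-effector rotation, and pixel with the highest affordance score."""
    if push_map.max() >= grasp_map.max():
        primitive, scores = "push", push_map
    else:
        primitive, scores = "grasp", grasp_map
    rot, y, x = np.unravel_index(np.argmax(scores), scores.shape)
    return primitive, (rot, y, x), float(scores[rot, y, x])

def critic_update(value, reward, next_value, gamma=0.99, lr=0.1):
    """One temporal-difference step evaluating the executed action, in the
    spirit of an actor-critic framework: the TD error measures how much
    better (or worse) the action was than the critic expected."""
    td_error = reward + gamma * next_value - value
    return value + lr * td_error, td_error

# One decision step on a dummy 64x64 RGB-D observation.
state = np.zeros((64, 64, 4))
push, grasp = affordance_map(state)
primitive, (rot, y, x), score = select_action(push, grasp)
print(primitive, rot, y, x)
```

In a full system the affordance scores would come from the vision network, the reward from the executed motion primitive's outcome, and the TD error would drive updates of both the critic and the affordance (actor) network.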