Flexible and Controllable Social Navigation Based on Topological Graph and Large Language Model

杨宜凡; 张千一; 宋一诺; 朱泽卿; 刘景泰

doi:10.13973/j.cnki.robot.240248

Flexible and Controllable Social Navigation Based on Topological Graph and Large Language Model

Abstract

Social navigation requires a robot to make flexible decisions grounded in an understanding of complex environments and human social rules, without depending on specific model functions, while making full use of broad world knowledge. To this end, a general navigation framework based on topological graphs and large language models is proposed. First, an environment understanding method built on obstacle clustering and graph theory provides the robot with candidate guiding points. Second, the role-playing and few-shot closed-loop optimization mechanisms of large language models are used to select the optimal guiding point, and a trajectory is generated and optimized with that guiding point as the target. Finally, the framework is validated experimentally in multiple static and dynamic scenes and tested on 4 large language models. The results show that combining guiding points with traditional trajectory optimization makes the navigation controllable, and that the world knowledge of large models lets the robot achieve a good balance between motion efficiency and social attributes, with the proportion of locally optimal decisions reaching 97.94%.
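The abstract does not give the details of the environment understanding step, so the following is only a minimal illustrative sketch of how obstacle clustering could yield candidate guiding points. Everything in it is an assumption made for illustration, not the authors' method: the function names `cluster_obstacles` and `candidate_guide_points`, the parameters `link_radius` and `min_gap`, the single-linkage clustering rule, and the choice of gap midpoints as candidates.

```python
from itertools import combinations
from math import dist


def cluster_obstacles(points, link_radius):
    """Group 2D obstacle points by single linkage (hypothetical rule):
    two points share a cluster if a chain of neighbours, each step no
    longer than link_radius, connects them."""
    labels = [None] * len(points)
    clusters = []
    for seed in range(len(points)):
        if labels[seed] is not None:
            continue  # already assigned to an earlier cluster
        labels[seed] = len(clusters)
        stack, members = [seed], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(len(points)):
                if labels[j] is None and dist(points[i], points[j]) <= link_radius:
                    labels[j] = len(clusters)
                    stack.append(j)
        clusters.append([points[i] for i in sorted(members)])
    return clusters


def candidate_guide_points(clusters, min_gap):
    """For every pair of clusters, find their closest pair of points and,
    if that gap is at least min_gap wide, propose its midpoint as a
    candidate guiding point (an assumed heuristic)."""
    candidates = []
    for a, b in combinations(clusters, 2):
        p, q = min(((p, q) for p in a for q in b), key=lambda pq: dist(*pq))
        if dist(p, q) >= min_gap:
            candidates.append(((p[0] + q[0]) / 2, (p[1] + q[1]) / 2))
    return candidates


# Two groups of obstacle points separated by a 3 m gap along the x axis.
obstacles = [(0, 0), (1, 0), (4, 0), (5, 0)]
clusters = cluster_obstacles(obstacles, link_radius=1.5)
guides = candidate_guide_points(clusters, min_gap=1.0)
```

In the framework described above, choosing among such candidates is left to the large language model; that decision step is not sketched here because the abstract does not specify the prompts or the feedback loop.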
