LUO Jiayuan, LIU Zeyang, LAN Xuguang. Long-horizon Task Planning Based on Multi-modal Diffusion Policy[J]. ROBOT, 2025, 47(4): 548-558. DOI: 10.13973/j.cnki.robot.250192

Long-horizon Task Planning Based on Multi-modal Diffusion Policy

  • In long-horizon robotic manipulation, the action sequences encountered in offline skill learning are diverse, the mapping between natural language instructions and long-horizon task semantics is complex, and the information density is high. To address these challenges, a long-horizon task planning algorithm based on a multi-modal diffusion policy, named MMDPP, is proposed to improve the task completion rate and robustness in complex environments. The method uses a large vision-language model to transform natural language tasks into structured task elements, introduces a multi-modal fusion module to model low-dimensional states, image observations, and task semantics in a unified way, and uses selective channels to reduce gradient conflict and cross-interference between gradients. On this basis, a conditional diffusion generative model is constructed to directly output structurally consistent, task-aligned action sequences, realizing end-to-end policy planning from language input to action prediction. In the MuJoCo-Kitchen-Image kitchen environment (a self-constructed dataset), MMDPP significantly outperforms the baseline method in long-horizon task success rate; in the Robosuite-Kitchen environment, it surpasses SiMPL by 2.4%; and on a UR5 physical robot platform, it achieves an 80% success rate in table-top rearrangement tasks, demonstrating good accuracy and real-world adaptability in manipulation tasks. The proposed method significantly enhances the adaptability of action policy learning to task changes in long-horizon tasks, providing an effective paradigm for diffusion-based long-horizon robot planning.
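To make the idea of a conditional diffusion model that outputs an action sequence more concrete, the following is a minimal sketch of standard DDPM-style reverse sampling over a (horizon × action_dim) array, conditioned on a fused feature vector. It is not the MMDPP implementation. The concatenation-based `fuse`, the `demo_target` plan, and the `oracle_eps` noise predictor are all hypothetical stand-ins introduced here for illustration; in a real system, a trained network would take the place of `oracle_eps`, and the fusion module and selective channels described in the abstract are not modeled.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule and cumulative alpha products (standard DDPM)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def fuse(state, image_feat, task_feat):
    """Hypothetical fusion: plain concatenation, NOT the paper's fusion module."""
    return np.concatenate([state, image_feat, task_feat])


def demo_target(cond, horizon, action_dim):
    """Hypothetical clean action plan derived from the condition (demo only)."""
    return np.tile(np.tanh(cond[:action_dim]), (horizon, 1))


def oracle_eps(a_t, t, cond, alpha_bars):
    """Stand-in for a trained noise predictor eps_theta(a_t, t, cond).

    Assumption: it knows the clean plan and returns the exact noise that
    maps it to a_t under the forward process. A real model would be learned.
    """
    horizon, action_dim = a_t.shape
    a0 = demo_target(cond, horizon, action_dim)
    return (a_t - np.sqrt(alpha_bars[t]) * a0) / np.sqrt(1.0 - alpha_bars[t])


def sample_actions(cond, horizon, action_dim, num_steps=50, seed=0):
    """Run the DDPM reverse process from Gaussian noise to an action sequence."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    a_t = rng.standard_normal((horizon, action_dim))
    for t in reversed(range(num_steps)):
        eps = oracle_eps(a_t, t, cond, alpha_bars)
        # Posterior mean of a_{t-1} given a_t and the predicted noise.
        mean = (a_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added on the final denoising step.
        noise = rng.standard_normal(a_t.shape) if t > 0 else 0.0
        a_t = mean + np.sqrt(betas[t]) * noise
    return a_t


# Illustrative inputs: low-dimensional state, image feature, task-semantics feature.
cond = fuse(np.array([0.1, -0.2]), np.array([0.5, 0.3, -0.4]), np.array([1.0, 0.0]))
actions = sample_actions(cond, horizon=8, action_dim=4)
```

Because the stand-in predictor is exact, the sampler recovers the hypothetical target plan; with a learned predictor the output would instead be a sample from the learned conditional action distribution.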
