Abstract:
Traditional robot imitation learning methods generally suffer from low imitation success rates and heavy reliance on the quantity of expert samples, which makes them unsuitable for highly time-varying and unstructured nursing scenarios. To solve these problems, TGOD-SD, an imitation learning method based on oriented diversity, is proposed for nursing robots. Firstly, a TGOD (trajectory generation with oriented diversity) paradigm is constructed, which can be implemented in reinforcement-learning-based imitation learning approaches. TGOD guides the agent to generate diverse imitation trajectories around the expert demonstration trajectory without constructing reward functions. Next, a trajectory matching method based on the Sinkhorn distance (SD) is proposed, which helps the agent search for the best-matching trajectory as the output of imitation learning. Finally, a sim-to-real transfer method based on joint angles is constructed to implement the imitated trajectory on the real nursing robot. Extensive imitation learning experiments on the nursing robot show that the proposed TGOD-SD method effectively improves the success rate of robot imitation learning, achieving an average improvement of 64.6% compared with state-of-the-art (SOTA) methods; the quality of successfully imitated trajectories is also improved, with an average increase of 32.61% in the correlation coefficient with expert demonstration trajectories; additionally, the expected time of successful imitation is reduced to 62.5% at least compared with SOTA methods. Notably, TGOD-SD accomplishes robot imitation learning from a single expert demonstration sample, which reduces the dependence on the quantity of expert demonstration samples and effectively improves the practicality of robot imitation learning.
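To make the trajectory matching step concrete, the following is a minimal sketch of how a Sinkhorn distance could be used to pick the generated trajectory closest to a single expert demonstration. It is an illustration under stated assumptions, not the implementation used in this work: the function names `sinkhorn_distance` and `best_matching_trajectory`, the uniform weights on trajectory points, the squared Euclidean cost, the regularization value `reg`, and the toy two-joint trajectories are all hypothetical choices. Note that this form treats a trajectory as an unordered set of points; time ordering is only respected if a time feature is appended to each point.

```python
import numpy as np


def sinkhorn_distance(x, y, reg=0.1, n_iters=200):
    """Entropy-regularized transport cost between two trajectories.

    x has shape (n, d) and y has shape (m, d); every point carries
    uniform weight (an assumption made for this sketch).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise squared Euclidean cost between trajectory points.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    scale = cost.max()
    if scale == 0.0:
        return 0.0  # all points coincide, so nothing has to be moved
    n, m = cost.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    # Scaling the cost by its maximum keeps the Gibbs kernel away from underflow.
    kernel = np.exp(-cost / (reg * scale))
    u = np.ones(n)
    for _ in range(n_iters):
        # Sinkhorn iterations: alternately rescale to match each marginal.
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum())


def best_matching_trajectory(expert, candidates, **kwargs):
    """Return (index, distance) of the candidate closest to the expert."""
    dists = [sinkhorn_distance(expert, c, **kwargs) for c in candidates]
    idx = int(np.argmin(dists))
    return idx, dists[idx]


# Toy usage: a hypothetical two-joint expert trajectory and three shifted copies.
t = np.linspace(0.0, 1.0, 20)
expert = np.stack([np.sin(t), np.cos(t)], axis=1)
candidates = [expert + 0.5, expert + 0.01, expert + 1.0]
idx, dist = best_matching_trajectory(expert, candidates)
```

In this toy setup the selected index points at the candidate with the smallest offset from the expert trajectory. A larger `reg` blurs the transport plan and makes the iterations converge faster, at the price of a less sharp distance.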