HUANG Zhong, REN Fuji, HU Min, LIU Juan. Robotic Facial Emotion Transfer Network Based on Transformer Framework and B-spline Smoothing Constraint[J]. ROBOT, 2023, 45(4): 395-408. DOI: 10.13973/j.cnki.robot.220351

Robotic Facial Emotion Transfer Network Based on Transformer Framework and B-spline Smoothing Constraint

  • To improve the spatio-temporal consistency of facial emotion transfer and reduce the influence of mechanical motion constraints on humanoid robots, a robotic facial emotion transformer (RFEFormer) network based on the Transformer framework and a B-spline smoothing constraint is proposed. The RFEFormer network consists of a facial deformation encoding subnet and an actuation sequence generation subnet. In the facial deformation encoding subnet, an intra-frame spatial attention module, built on the dual mechanisms of intra-domain deformation attention and inter-domain cooperative attention, is embedded into the Transformer encoder to represent intra-frame spatial information at different levels and granularities. In the actuation sequence generation subnet, a Transformer decoder, which performs cross attention between the facial spatio-temporal sequence and the historical motor actuation sequence, is designed for multi-step prediction of the future motor drive sequence. Moreover, a cubic B-spline smoothing constraint is introduced to warp the predicted sequence. The experimental results show that the motor actuation deviation, facial deformation fidelity, and motor motion smoothness of the RFEFormer network are 3.21%, 89.48%, and 90.63%, respectively. Furthermore, the frame rate of real-time facial emotion transfer exceeds 25 frames per second. Compared with related methods, the proposed RFEFormer network not only meets the real-time requirement but also improves time-sequence-based metrics such as fidelity and smoothness, to which human perception is more sensitive.
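
The abstract does not say how the cubic B-spline smoothing constraint is formulated, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the predicted motor values are treated as control points of a uniform cubic B-spline and that the curve is resampled to obtain a smoother drive sequence. The function names (cubic_bspline_weights, bspline_smooth, roughness), the endpoint padding, and the sample values are all hypothetical.

```python
def cubic_bspline_weights(t):
    """Blending weights of a uniform cubic B-spline for local parameter t in [0, 1]."""
    return (
        (1 - t) ** 3 / 6.0,
        (3 * t**3 - 6 * t**2 + 4) / 6.0,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0,
        t**3 / 6.0,
    )


def bspline_smooth(values, samples_per_segment=1):
    """Use `values` as control points and sample the resulting cubic B-spline.

    Assumption: the first and last values are repeated once so that the curve
    covers the whole sequence. With samples_per_segment=1 the output has the
    same length as the input.
    """
    pts = [float(v) for v in values]
    if len(pts) < 3:
        return pts
    ctrl = [pts[0]] + pts + [pts[-1]]
    out = []
    for seg in range(len(ctrl) - 3):
        window = ctrl[seg:seg + 4]
        for k in range(samples_per_segment):
            w = cubic_bspline_weights(k / samples_per_segment)
            out.append(sum(wi * ci for wi, ci in zip(w, window)))
    # Add the final sample at the end of the last segment (t = 1).
    w = cubic_bspline_weights(1.0)
    out.append(sum(wi * ci for wi, ci in zip(w, ctrl[-4:])))
    return out


def roughness(seq):
    """Sum of squared second differences; lower means a smoother sequence."""
    return sum((seq[i - 1] - 2 * seq[i] + seq[i + 1]) ** 2
               for i in range(1, len(seq) - 1))


if __name__ == "__main__":
    predicted = [0, 10, 0, 10, 0]  # hypothetical jittery motor predictions
    smoothed = bspline_smooth(predicted)
    print(smoothed)
    print(roughness(predicted), roughness(smoothed))
```

In a training setting, a term such as roughness, or the distance between the raw and the spline-fitted prediction, could plausibly serve as a penalty. Whether RFEFormer does this is not stated in the abstract.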
