A Monocular Visual SLAM Algorithm Based on Point-Line Features


Abstract: When fast camera motion blurs the images or the scene lacks texture, SLAM (simultaneous localization and mapping) algorithms based on point features struggle to track enough valid feature points; localization accuracy and robustness degrade, and the system may even fail entirely. To address this problem, a monocular visual SLAM algorithm based on point and line features and fused with wheel-odometer data is designed. First, the complementarity of point and line features is exploited to improve data-association accuracy, and on this basis an environmental feature map with geometric information is constructed; meanwhile, wheel-odometer data is incorporated to provide prior and scale information for the visual localization algorithm. Then, a more accurate visual pose is estimated by minimizing the reprojection errors of points and line segments in the local map; when visual localization fails, the localization system continues to work using the wheel-odometer data. Simulation results on several public datasets show that the proposed algorithm outperforms the multi-state constraint Kalman filter (MSCKF) and large-scale direct monocular SLAM (LSD-SLAM) algorithms, demonstrating its accuracy and effectiveness. Finally, the algorithm is applied to a robot system built by our group; the root mean square error (RMSE) of monocular visual localization is about 7 cm, and the processing time is about 90 ms per 640×480 frame on an embedded platform with a quad-core 1.2 GHz processor.
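The core optimization step described in the abstract, minimizing the reprojection errors of local map points and line segments, can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the paper's implementation: the intrinsics `FX, FY, CX, CY` are made up, only the camera translation is optimized (rotation fixed at identity), and each line segment contributes the reprojection error of its two 3D endpoints.

```python
import numpy as np

# Pinhole intrinsics (assumed values for this sketch, not from the paper)
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

def project(points_w, t):
    """Project 3D world points into the image for a camera at
    translation t with identity rotation (simplified pose model)."""
    p = points_w - t                           # world -> camera frame
    u = FX * p[:, 0] / p[:, 2] + CX
    v = FY * p[:, 1] / p[:, 2] + CY
    return np.stack([u, v], axis=1)

def residuals(t, pts_w, pts_obs, line_w, line_obs):
    """Stack point reprojection errors and line-segment endpoint
    reprojection errors into one residual vector."""
    r_pts = (project(pts_w, t) - pts_obs).ravel()
    r_lines = (project(line_w, t) - line_obs).ravel()
    return np.concatenate([r_pts, r_lines])

def gauss_newton(t0, pts_w, pts_obs, line_w, line_obs, iters=10):
    """Refine the camera translation by Gauss-Newton iterations,
    using a finite-difference Jacobian for clarity."""
    t = t0.astype(float).copy()
    eps = 1e-6
    for _ in range(iters):
        r = residuals(t, pts_w, pts_obs, line_w, line_obs)
        J = np.zeros((r.size, 3))
        for k in range(3):
            dt = np.zeros(3)
            dt[k] = eps
            J[:, k] = (residuals(t + dt, pts_w, pts_obs,
                                 line_w, line_obs) - r) / eps
        # Solve the normal equations via least squares and update the pose
        t -= np.linalg.lstsq(J, r, rcond=None)[0]
    return t

# Synthetic usage: recover a known translation from noiseless observations
np.random.seed(0)
pts_w = np.random.uniform([-1, -1, 4], [1, 1, 8], size=(10, 3))   # map points
line_w = np.random.uniform([-1, -1, 4], [1, 1, 8], size=(4, 3))   # 2 segments' endpoints
t_true = np.array([0.1, -0.05, 0.2])
pts_obs = project(pts_w, t_true)
line_obs = project(line_w, t_true)
t_est = gauss_newton(np.zeros(3), pts_w, pts_obs, line_w, line_obs)
```

A full SLAM back end would optimize rotation as well (e.g. on SE(3)) and use a robust loss, but the sketch shows the shared structure: point and line observations simply contribute extra rows to one stacked residual vector.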

