WANG Kesai¹, YAO Xifan¹, HUANG Yu¹, LIU Min¹, LU Yuqian²
1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, China; 2. School of Engineering, University of Auckland, Auckland 1142, New Zealand
WANG Kesai, YAO Xifan, HUANG Yu, LIU Min, LU Yuqian. Review of Visual SLAM in Dynamic Environment[J]. Robot, 2021, 43(6): 715-732. DOI: 10.13973/j.cnki.robot.200468.
Abstract: This review addresses visual SLAM (simultaneous localization and mapping) systems operating in dynamic environments. It first analyzes how dynamic objects in the environment harm classic visual SLAM systems, and divides existing dynamic visual SLAM systems into two categories according to whether or not the system is based on camera ego-motion. It then summarizes and analyzes the current state of research on visual SLAM in dynamic environments. Finally, it discusses processing methods for visual SLAM in dynamic environments and the outlook for future trends.