ZHAO Yang, LIU Guoliang, TIAN Guohui, LUO Yong, WANG Ziren, ZHANG Wei, LI Junwei. A Survey of Visual SLAM Based on Deep Learning[J]. Robot, 2017, 39(6): 889-896. DOI: 10.13973/j.cnki.robot.2017.0889.
Abstract: Recent research progress in applying deep learning techniques to SLAM (simultaneous localization and mapping) is summarized. Notable achievements in inter-frame motion estimation, loop closure detection, and semantic SLAM that incorporate deep learning are introduced. Deep-learning-based SLAM is then compared in detail with traditional approaches. Finally, future research directions for advanced SLAM based on deep learning are discussed.