Abstract: Most existing SLAM (simultaneous localization and mapping) methods assume a static environment and treat moving objects in the scene as interference, which reduces the accuracy of localization and mapping and can even cause failure. Although detecting and tracking moving objects is necessary in many applications, it is usually ignored when solving the SLAM problem. To address this, a method combining LiDAR and an IMU (inertial measurement unit) is proposed that performs SLAM and the detection and tracking of moving objects simultaneously. First, the motion distortion caused by LiDAR movement during scanning is compensated using the IMU. All potentially moving targets are then detected from the motion-compensated point cloud by an FCNN (fully convolutional neural network). The targets are tracked with a UKF (unscented Kalman filter), which allows static and dynamic targets to be distinguished. The remaining static background point cloud is then used for data association and motion estimation to achieve localization and mapping. To further improve the accuracy and consistency of the mapping results, loop closure detection is integrated, and the trajectory and map are globally optimized using graph optimization. Experiments are carried out on the public KITTI dataset and on a dataset collected with a self-developed experimental platform. The results show that, compared with traditional SLAM methods, the proposed method not only detects and tracks moving objects effectively, but also completes vehicle pose estimation and map building in a real-time, robust, low-drift manner, with mapping accuracy significantly better than that of the other methods compared.
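The motion-compensation step can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes that, within a single scan, the IMU yields a constant yaw rate and a constant sensor velocity, and the names `deskew_scan`, `yaw_rate`, and `velocity` are hypothetical. Each point is re-expressed in the sensor frame at the start of the scan, using the sensor pose interpolated at the point's own capture time.

```python
import numpy as np


def deskew_scan(points, timestamps, yaw_rate, velocity):
    """Re-express every LiDAR point in the sensor frame at scan start.

    points:     (N, 3) array, each point given in the sensor frame
                at the instant it was captured
    timestamps: (N,) array, seconds elapsed since the scan started
    yaw_rate:   assumed constant yaw rate during the scan, rad/s
    velocity:   (3,) assumed constant sensor velocity, expressed in
                the scan-start frame, m/s
    """
    points = np.asarray(points, dtype=float)
    t = np.asarray(timestamps, dtype=float)

    # Sensor heading at each capture time (rotation about the z axis only).
    yaw = yaw_rate * t
    c, s = np.cos(yaw), np.sin(yaw)

    # Rotate each point by the heading the sensor had when it was captured.
    x = c * points[:, 0] - s * points[:, 1]
    y = s * points[:, 0] + c * points[:, 1]
    rotated = np.stack([x, y, points[:, 2]], axis=1)

    # Add the sensor displacement accumulated up to each capture time.
    return rotated + t[:, None] * np.asarray(velocity, dtype=float)


def simulate_measurement(landmark, t, yaw_rate, velocity):
    """Hypothetical helper: what a moving sensor would measure at time t
    for a static landmark given in the scan-start frame."""
    yaw = yaw_rate * t
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return rot.T @ (np.asarray(landmark, dtype=float) - np.asarray(velocity) * t)


# A static landmark observed three times during one 0.1 s scan.
landmark = np.array([10.0, 2.0, 0.5])
times = np.array([0.0, 0.05, 0.1])
vel = np.array([10.0, 0.0, 0.0])
rate = 0.5
raw = np.array([simulate_measurement(landmark, t, rate, vel) for t in times])
corrected = deskew_scan(raw, times, rate, vel)
```

In this example the three raw measurements of the same static landmark differ because the sensor moves during the scan, while the corrected points coincide. A full system would integrate the IMU readings for a complete 3D rotation rather than assume a constant yaw rate.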