
Improving RGB-D SLAM in dynamic environments using semantic aided segmentation

Published online by Cambridge University Press: 16 November 2021

Lhilo Kenye*
Affiliations:
Centre of Intelligent Robotics, Indian Institute of Information Technology, Allahabad, Prayagraj, India; NavAjna Technologies Pvt. Ltd., Hyderabad, India
Rahul Kala
Affiliation:
Centre of Intelligent Robotics, Indian Institute of Information Technology, Allahabad, Prayagraj, India
*Corresponding author. E-mail: lkenye02@gmail.com

Summary

Most conventional simultaneous localization and mapping (SLAM) approaches assume the working environment to be static. In a highly dynamic environment, this assumption exposes the limitations of SLAM algorithms that, despite the inclusion of optimization techniques, lack modules that explicitly handle dynamic objects. This work addresses such environments and reduces the effect of dynamic objects on a SLAM algorithm by separating the features belonging to dynamic objects from those of the static background using a generated binary mask image. While the features belonging to the static region are used for performing SLAM, the features belonging to non-static segments are reused rather than eliminated. The approach employs a deep neural network (DNN)-based object detection module to obtain bounding boxes and then generates a lower-resolution binary mask image by running a depth-first search over the detected semantics, segmenting the dynamic foreground from the static background. In addition, the features belonging to dynamic objects are tracked into consecutive frames to obtain better masking consistency. The proposed approach is tested on both a publicly available dataset and a self-collected dataset covering indoor and outdoor environments. The experimental results show that removing features belonging to dynamic objects can significantly improve the overall output of a SLAM algorithm in a dynamic scene.
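To make the pipeline described above concrete, the sketch below gives one plausible reading of the masking step in Python with OpenCV: detector bounding boxes seed an iterative depth-first search over a downsampled depth image, growing each detection into a binary mask that then partitions the extracted features into static and dynamic sets. This is a minimal sketch, not the authors' implementation; the box format, depth tolerance, scale factor, and function names are illustrative assumptions.

import cv2
import numpy as np

DEPTH_TOL = 0.05  # assumed relative depth tolerance for region growing
SCALE = 4         # mask is built at 1/SCALE resolution for speed

def grow_mask(depth, boxes):
    """Grow a binary mask from detector boxes via iterative DFS.

    depth: HxW float32 depth image in metres.
    boxes: list of (x, y, w, h) bounding boxes in full-resolution pixels,
           assumed to come from a DNN detector and already filtered to
           potentially dynamic classes (e.g. 'person').
    """
    small = cv2.resize(depth, None, fx=1.0 / SCALE, fy=1.0 / SCALE,
                       interpolation=cv2.INTER_NEAREST)
    h, w = small.shape
    mask = np.zeros((h, w), np.uint8)
    for (x, y, bw, bh) in boxes:
        # Scaled-down box bounds; the search is confined to the detection.
        x0, y0 = x // SCALE, y // SCALE
        x1, y1 = min((x + bw) // SCALE, w), min((y + bh) // SCALE, h)
        if x0 >= x1 or y0 >= y1:
            continue
        cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
        seed_d = small[cy, cx]
        if seed_d <= 0:  # invalid depth at the seed, skip this box
            continue
        stack = [(cx, cy)]  # explicit stack = depth-first traversal
        while stack:
            px, py = stack.pop()
            if not (x0 <= px < x1 and y0 <= py < y1) or mask[py, px]:
                continue
            d = small[py, px]
            # Keep pixels whose depth stays close to the seed surface,
            # i.e. on the same (dynamic) object rather than the background.
            if d <= 0 or abs(d - seed_d) > DEPTH_TOL * seed_d:
                continue
            mask[py, px] = 255
            stack.extend([(px + 1, py), (px - 1, py),
                          (px, py + 1), (px, py - 1)])
    # Upsample back to full resolution for masking feature points.
    return cv2.resize(mask, (depth.shape[1], depth.shape[0]),
                      interpolation=cv2.INTER_NEAREST)

def split_features(keypoints, mask):
    """Static features feed the SLAM front end; dynamic ones are kept
    (not discarded) and can be tracked into the next frame, e.g. with
    cv2.calcOpticalFlowPyrLK, so the mask stays consistent over time."""
    static, dynamic = [], []
    for kp in keypoints:
        u, v = int(kp.pt[0]), int(kp.pt[1])
        (dynamic if mask[v, u] else static).append(kp)
    return static, dynamic

Running the search at reduced resolution keeps the per-frame cost low, and retaining the dynamic features allows them to be tracked forward across consecutive frames, as the summary describes, rather than being recomputed from scratch.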

Type: Research Article
Copyright: © The Author(s), 2021. Published by Cambridge University Press

