
Picking out the Impurities: Attention-based Push-Grasping in Dense Clutter

Published online by Cambridge University Press:  25 March 2022

Ning Lu
China Waterborne Transport Research Institute, Beijing, 100088, China

Yinghao Cai
The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China

Tao Lu*
The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China

Xiaoge Cao
The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China

Weiyan Guo
The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China

Shuo Wang
The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, 200031, China

*Corresponding author. E-mail: tao.lu@ia.ac.cn

Abstract

“Picking out the impurities” is a typical scenario on production lines that is both time-consuming and laborious. In this article, we propose a target-oriented robotic push-grasping system that actively discovers and picks out impurities in dense environments through the synergy between pushing and grasping actions. First, we propose an attention module that combines target saliency detection with density-based occluded-region inference. Without the need for expensive semantic segmentation labels, the attention module can quickly locate targets in the view or predict the candidate regions where targets are most likely to be occluded. Second, we propose a push-grasp synergy framework that sequentially selects the proper action in each situation until all targets are picked out. Moreover, we introduce an active pushing mechanism based on a novel metric, the Target-Centric Dispersion Degree (TCDD), for better grasping. TCDD describes whether a target is isolated from the surrounding objects. With this metric, the robot focuses its actions around the targets and pushes irrelevant objects away. Experimental results in both simulated and real-world environments show that the proposed system outperforms several baseline approaches and generalizes to new scenarios.
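The abstract does not give the exact formula for TCDD, only its intent: quantify how isolated a target is from its neighbours so the robot can decide between pushing and grasping. The sketch below is an illustrative assumption of such a score, not the authors' definition; the function name `target_centric_dispersion`, the neighbourhood radius, and the decision threshold are all hypothetical.

```python
import numpy as np

def target_centric_dispersion(target_xy, obstacle_xy, radius=0.1):
    """Hypothetical sketch of a TCDD-style isolation score.

    target_xy   : (2,) workspace coordinates of the target centroid (metres).
    obstacle_xy : (N, 2) centroids of surrounding, non-target objects.
    radius      : neighbourhood radius around the target to consider.

    Returns a score in [0, 1]; higher means the target is more isolated
    from its neighbours and therefore easier to grasp directly.
    """
    if len(obstacle_xy) == 0:
        return 1.0  # nothing nearby: fully isolated
    d = np.linalg.norm(obstacle_xy - target_xy, axis=1)
    near = d[d < radius]
    if near.size == 0:
        return 1.0
    # Normalised mean clearance inside the neighbourhood: clutter pressed
    # right against the target drives the score toward 0.
    return float(np.mean(near) / radius)

# Toy usage: pick an action from the score (threshold is an assumption).
target = np.array([0.05, 0.02])
others = np.array([[0.06, 0.03], [0.20, 0.25]])
score = target_centric_dispersion(target, others)
action = "grasp" if score > 0.5 else "push"
print(f"TCDD-style score: {score:.2f} -> {action}")
```

Under this reading, a low score signals that the robot should first push irrelevant objects away from the target, then attempt the grasp once the score rises, which matches the push-grasp synergy the abstract describes.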

Type: Research Article
Copyright: © The Author(s), 2022. Published by Cambridge University Press

