A few-shot semantic segmentation method based on adaptively mining correlation network

Zhifu Huang; Bin Jiang; Yu Liu

doi:10.1017/S0263574723000206


Published online by Cambridge University Press: 13 March 2023


Affiliation (Zhifu Huang, Bin Jiang, Yu Liu*): School of Automation Science and Engineering, South China University of Technology, Guangzhou, China
*Corresponding author: Yu Liu. E-mail: auylau@scut.edu.cn

Abstract

Few-shot semantic segmentation aims to learn a model that can segment novel classes in query images when only a few annotated support examples are available. Because of large intra-class variations, building accurate semantic correlation remains challenging. Current methods typically use 4D kernels to learn the semantic correlation of feature maps. However, they still struggle to reduce computation and memory consumption while preserving the usefulness of the correlations they mine. In this paper, we propose the adaptively mining correlation network (AMCNet) to alleviate these issues. AMCNet rests on two components: the adaptive separable 4D kernel, which forms the basic block of the correlation encoder, and the learnable pyramid correlation module, which provides a learnable concatenation operation over pyramid correlation tensors. Experiments on the PASCAL VOC 2012 dataset show that AMCNet surpasses the state-of-the-art method by $0.7\%$ and $2.2\%$ in the 1-shot and 5-shot segmentation scenarios, respectively.
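To make the notion of a 4D correlation tensor and the cost of 4D kernels more concrete, the following is a minimal sketch in plain Python. It is not the AMCNet implementation: the function names `cosine_correlation_4d` and `kernel_weights` are hypothetical, cosine similarity is assumed as the correlation measure, and splitting a 4D kernel into two 2D kernels is only one common way to make it separable, not necessarily the adaptive separable kernel proposed in the paper.

```python
import math


def cosine_correlation_4d(query, support):
    """Build a 4D correlation tensor from two feature maps.

    query:   Hq x Wq x C nested lists of floats (assumed layout)
    support: Hs x Ws x C nested lists of floats (assumed layout)
    Returns corr with corr[i][j][k][l] = cosine similarity between
    query[i][j] and support[k][l].
    """
    def unit(v):
        # L2-normalise a feature vector; zero vectors are left as zeros.
        n = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / n for x in v]

    q = [[unit(v) for v in row] for row in query]
    s = [[unit(v) for v in row] for row in support]
    return [[[[sum(a * b for a, b in zip(qv, sv)) for sv in srow]
              for srow in s]
             for qv in qrow]
            for qrow in q]


def kernel_weights(k, separable):
    """Weights per input/output channel pair for a kernel of side k.

    A full 4D kernel needs k**4 weights. Splitting it into two 2D
    kernels (an illustrative assumption) needs only 2 * k**2.
    """
    return 2 * k ** 2 if separable else k ** 4


# Tiny usage example: a 1x2 query map and a 1x1 support map, C = 2.
query = [[[1.0, 0.0], [0.0, 1.0]]]
support = [[[1.0, 0.0]]]
corr = cosine_correlation_4d(query, support)
full = kernel_weights(3, separable=False)   # 81
split = kernel_weights(3, separable=True)   # 18
```

The tensor has one entry per pair of query and support positions, so its size grows as Hq × Wq × Hs × Ws. This is why both the correlation tensor and any dense 4D kernel applied to it become expensive in computation and memory.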

Keywords

computer vision; deep learning; convolutional neural network; few-shot semantic segmentation; intelligent system

Type: Research Article
Information: Robotica, Volume 41, Issue 6, June 2023, pp. 1828–1836

DOI: https://doi.org/10.1017/S0263574723000206
Copyright: © The Author(s), 2023. Published by Cambridge University Press


