A few-shot semantic segmentation method based on adaptively mining correlation network

Published online by Cambridge University Press: 13 March 2023

Zhifu Huang
Affiliation:
School of Automation Science and Engineering, South China University of Technology, Guangzhou, China
Bin Jiang
Affiliation:
School of Automation Science and Engineering, South China University of Technology, Guangzhou, China
Yu Liu*
Affiliation:
School of Automation Science and Engineering, South China University of Technology, Guangzhou, China
*Corresponding author. E-mail: auylau@scut.edu.cn

Abstract

Few-shot semantic segmentation aims to learn a model that can segment novel classes in query images when only a few annotated support examples are available. Because intra-class variation is large, building accurate semantic correlations remains challenging. Current methods typically use 4D kernels to learn the semantic correlation of feature maps. However, they still struggle to reduce computation and memory consumption while preserving the usefulness of the correlations they mine. In this paper, we propose the adaptively mining correlation network (AMCNet) to alleviate these issues. AMCNet rests on two components: the proposed adaptive separable 4D kernel, which forms the basic block of the correlation encoder, and the learnable pyramid correlation module, which provides a learnable concatenation operation over pyramid correlation tensors. Experiments on the PASCAL VOC 2012 dataset show that AMCNet surpasses the state-of-the-art method by $0.7\%$ and $2.2\%$ in the 1-shot and 5-shot segmentation scenarios, respectively.
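To make the computation and memory concern concrete, the following minimal NumPy sketch builds a 4D correlation tensor between a query and a support feature map and compares the weight count of a full 4D kernel with that of a separable one. It is an illustration under stated assumptions, not the AMCNet implementation: the cosine-similarity correlation, the factorization into two 2D kernels, and the names `correlation_4d` and `kernel_weights` are hypothetical choices made for this example only.

```python
import numpy as np

def correlation_4d(query_feat, support_feat, eps=1e-8):
    """Cosine similarity between every query and every support location.

    query_feat:   array of shape (C, Hq, Wq)
    support_feat: array of shape (C, Hs, Ws)
    returns:      array of shape (Hq, Wq, Hs, Ws)
    """
    # L2-normalize each spatial location along the channel axis.
    q = query_feat / (np.linalg.norm(query_feat, axis=0, keepdims=True) + eps)
    s = support_feat / (np.linalg.norm(support_feat, axis=0, keepdims=True) + eps)
    # Dot product over channels for every pair of locations.
    return np.einsum("cij,ckl->ijkl", q, s)

def kernel_weights(k):
    """Single-channel weight counts: full k^4 kernel vs. two k^2 kernels.

    Assumption: the separable variant applies one 2D kernel over the query
    dimensions and one over the support dimensions.
    """
    return k ** 4, 2 * k ** 2

rng = np.random.default_rng(0)
query = rng.standard_normal((16, 8, 8))    # hypothetical feature sizes
support = rng.standard_normal((16, 8, 8))

corr = correlation_4d(query, support)
print(corr.shape)         # (8, 8, 8, 8)
print(kernel_weights(3))  # (81, 18)
```

Even at this toy resolution the correlation tensor holds 8^4 = 4096 entries, and a full 3×3×3×3 kernel needs 81 weights per channel pair against 18 for the separable form, which is the kind of saving a separable 4D kernel targets.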

Type
Research Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press
