Grouped Double Attention Network for Semantic Segmentation

Xiaolong Chen; Ji Zhao; Siyi Chen; Xinhao Du; Xin Liu

doi:10.3788/LOP202158.2210007
