Real-Time Semantic Segmentation Network Based on Regional Self-Attention

Hailong Bao; Min Wan; Zhongxiang Liu; Mian Qin; Haoyu Cui

doi:10.3788/LOP202158.0810018

[1] Tang C Y, Pu S L, Ye P Z et al. Fusion of low-illuminance visible and near-infrared images based on convolutional neural networks[J]. Acta Optica Sinica, 40, 1610001(2020).

[2] Kong F Q, Zhou Y B, Shen Q et al. End-to-end multispectral image compression using convolutional neural network[J]. Chinese Journal of Lasers, 46, 1009001(2019).

[3] Liu H, Peng L, Wen J W. Multi-scale aware pedestrian detection algorithm based on improved full convolutional network[J]. Laser & Optoelectronics Progress, 55, 091504(2018).

[4] He Y H, Wang H, Zhang B. Color-based road detection in urban traffic scenes[J]. IEEE Transactions on Intelligent Transportation Systems, 5, 309-318(2004). http://ieeexplore.ieee.org/document/1252047

[5] Yao L S, Xu G M, Zhao F. Facial expression recognition based on local feature fusion of convolutional neural network[J]. Laser & Optoelectronics Progress, 57, 041513(2020).

[6] Zhang Z H, Fang W, Du L L et al. Semantic segmentation of remote sensing image based on encoder-decoder convolutional neural network[J]. Acta Optica Sinica, 40, 0310001(2020).

[7] Zhang X F, Liu J, Shi Z S et al. Review of deep learning-based semantic segmentation[J]. Laser & Optoelectronics Progress, 56, 150003(2019).

[8] Lin G S, Milan A, Shen C H et al. RefineNet: multi-path refinement networks for high-resolution semantic segmentation[C]. //2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA, 5168-5177(2017).

[9] Peng C, Zhang X Y, Yu G et al. Large kernel matters: improve semantic segmentation by global convolutional network[C]. //2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA, 1743-1751(2017).

[10] Zhao H S, Shi J P, Qi X J et al. Pyramid scene parsing network[C]. //2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA., 6230-6239(2017).

[11] Chen L C, Papandreou G, Schroff F et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. [2019-12-09]. https://arxiv.org/abs/1706.05587

[12] Wang X L, Girshick R, Gupta A et al. Non-local neural networks[C]. //2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA., 7794-7803(2018).

[13] Fu J, Liu J, Tian H J et al. Dual attention network for scene segmentation[C]. //2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 15-20, 2019, Long Beach, CA, USA., 3141-3149(2019).

[14] Yu C Q, Wang J B, Peng C et al. Learning a discriminative feature network for semantic segmentation[C]. //2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA., 1857-1866(2018).

[15] Hu J, Shen L, Albanie S et al. Squeeze-and-excitation networks[EB/OL]. [2020-07-20]. https://arxiv.org/abs/1709.01507

[16] Woo S, Park J, Lee J Y et al. CBAM: convolutional block attention module[EB/OL]. [2020-07-25]. https://arxiv.org/abs/1807.06521

[17] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C]. //2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA., 770-778(2016).

[18] Chen L C, Papandreou G, Kokkinos I et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[EB/OL]. [2020-07-21]. https: //arxiv.org/abs/1606.00915

[19] Mehta S, Rastegari M, Shapiro L et al. ESPNetv2: a light-weight, power efficient, and general purpose convolutional neural network[C]. //2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 15-20, 2019, Long Beach, CA, USA, 9182-9192(2019).

[20] Cordts M, Omran M, Ramos S et al. The cityscapes dataset for semantic urban scene understanding[C]. //2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA., 3213-3223(2016).

[21] Yu C Q, Wang J B, Peng C et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[M]. //Ferrari V, Hebert M, Sminchisescu C, et al. Computer Vision-ECCV 2018. Lecture Notes in Computer Science. Cham: Springer, 11217, 334-349(2018).

[22] Paszke A, Chaurasia A, Kim S et al. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. [2020-07-23]. https://arxiv.org/abs/1606.02147

[23] Mehta S, Rastegari M, Caspi A et al. ESPNet: efficient spatial pyramid of dilated convolutions for semantic segmentation[M]. //Ferrari V, Hebert M, Sminchisescu C, et al. Computer Vision-ECCV 2018. Lecture Notes in Computer Science. Cham: Springer, 11214, 561-580(2018).

[24] Romera E, Álvarez J M, Bergasa L M et al. ERFNet: efficient residual factorized ConvNet for real-time semantic segmentation[J]. IEEE Transactions on Intelligent Transportation Systems, 19, 263-272(2018). http://ieeexplore.ieee.org/document/8063438

[25] Zhao H S, Qi X J, Shen X Y et al. ICNet for real-time semantic segmentation on high-resolution images[M]. //Ferrari V, Hebert M, Sminchisescu C, et al. Computer Vision-ECCV 2018. Lecture Notes in Computer Science. Cham: Springer, 11207, 418-434(2018).

[26] Wang Y, Zhou Q, Liu J et al. Lednet: a lightweight encoder-decoder network for real-time semantic segmentation[C]. //2019 IEEE International Conference on Image Processing (ICIP), September 22-25, 2019, Taipei, Taiwan, China, 1860-1864(2019).

[27] Li G, Yun I, Kim J et al. DABNet: depth-wise asymmetric bottleneck for real-time semantic segmentation[EB/OL]. [2020-07-22]. https://arxiv.org/abs/1907.11357

[28] Li H C, Xiong P F, Fan H Q et al. DFANet: deep feature aggregation for real-time semantic segmentation[C]. //2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 15-20, 2019, Long Beach, CA, USA, 9514-9523(2019).