Opto-Electronic Engineering
Vol. 51, Issue 1, 230304-1 (2024)

Design of Swin Transformer for semantic segmentation of road scenes

Hao Hang, Yingping Huang*, Xurui Zhang, and Xin Luo

Author Affiliations

School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

DOI: 10.12086/oee.2024.230304

Hao Hang, Yingping Huang, Xurui Zhang, Xin Luo. Design of Swin Transformer for semantic segmentation of road scenes[J]. Opto-Electronic Engineering, 2024, 51(1): 230304-1
