• Optics and Precision Engineering
  • Vol. 32, Issue 8, 1212 (2024)
Zhiqiang HOU¹,², Minjie CHENG¹,²,*, Sugang MA¹,², Minjie QU¹,², and Xiaobao YANG¹,²
Author Affiliations
  • ¹ Xi'an University of Posts and Telecommunications, Institute of Computer, Xi'an702, China
  • ² Xi'an University of Posts and Telecommunications, Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an71011, China
    DOI: 10.37188/OPE.20243208.1212
    Zhiqiang HOU, Minjie CHENG, Sugang MA, Minjie QU, Xiaobao YANG. Real-time urban street view semantic segmentation based on cross-layer aggregation network[J]. Optics and Precision Engineering, 2024, 32(8): 1212
    Fig. 1. Overall structure of the Cross-Layer Aggregation Network (CLANet)
    Fig. 2. Cross-Layer Aggregation Module
    Fig. 3. Comparison diagram between DAPPM and CLA-PPM
    Fig. 4. Multi-Scale Fusion Module (MSFM)
    Fig. 5. Accuracy-speed comparison on the Cityscapes test set
    Fig. 6. Accuracy-speed comparison on the CamVid test set
    Fig. 7. Visual segmentation results on the Cityscapes dataset
    Fig. 8. Visual segmentation results on the CamVid dataset
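    | Baseline | DAPPM | CLA-PPM (Sparsity=2, Sparsity=3) | FLOPs/G | Params/M | mIoU/% | Speed/FPS |
    |---|---|---|---|---|---|---|
    |  |  |  | 97.7 | 14.2 | 74.5 | 96 |
    |  |  |  | 98.8 | 15.53 | 75.2 | 83 |
    |  |  |  | 98.0 | 14.68 | 75.0 | 89 |
    |  |  |  | 97.9 | 14.52 | 74.9 | 90 |
    |  |  |  | 98.2 | 15.0 | 75.3 | 85 |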
    Table 1. Ablation study of CLA-PPM on the Cityscapes validation set
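The mIoU/% column in the tables is the mean intersection-over-union across classes. The page does not spell out the computation, so the following is only a minimal sketch of the standard definition, assuming a confusion matrix accumulated over the whole evaluation set; the function name `mean_iou` and the toy matrix are illustrative and not taken from the paper.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] = number of pixels whose true class is i
    and whose predicted class is j (assumed layout).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both labels and predictions: skip it
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical two-class example (pixel counts are made up).
toy = [[50, 10],
       [5, 35]]
print(f"mIoU = {100 * mean_iou(toy):.1f}%")
```

Multiplying the returned ratio by 100 gives a percentage on the same scale as the mIoU/% columns.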
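    | Baseline | CLAM | CLA-PPM | MSFM | FLOPs/G | Params/M | mIoU/% | Speed/FPS |
    |---|---|---|---|---|---|---|---|
    |  |  |  |  | 97.7 | 14.2 | 74.5 | 96 |
    |  |  |  |  | 103.1 | 14.5 | 75.2 | 86 |
    |  |  |  |  | 98.2 | 15.0 | 75.3 | 85 |
    |  |  |  |  | 98.3 | 14.22 | 74.8 | 93 |
    |  |  |  |  | 103.6 | 15.32 | 75.7 | 77 |
    |  |  |  |  | 98.8 | 15.04 | 75.5 | 83 |
    |  |  |  |  | 103.7 | 14.53 | 75.4 | 84 |
    |  |  |  |  | 104.3 | 15.35 | 76.0 | 75 |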
    Table 2. Ablation study of CLANet on the Cityscapes validation set
    | Method | Reference | Resolution | mIoU/% (Val / Test) | #FPS (PyTorch / TensorRT) |
    |---|---|---|---|---|
    | GAS [19] | CVPR 2020 | 769×1537 | 71.8 | 108.4 |
    | HMSeg [20] | BMVC 2020 | 768×1536 | 74.3 | 83.2 |
    | DCNet [21] | ICPR 2021 | 512×1024 | 71.2 | 142 |
    | HyperSeg-M [22] | CVPR 2021 | 1024×2048 | 76.2 / 75.8 | 36.9 |
    | RELAXNet [23] | Neurocomputing 2022 | 512×1024 | 74.8 | 64 |
    | FPANet C [24] | APPL INTELL 2022 | 1024×2048 | 75.9 | 31 |
    | BiAttnNet [25] | SPL 2022 | 512×1024 | 74.7 | 89.2 |
    | LETNet [26] | T-ITS 2023 | 512×1024 | 72.8 | 150 |
    | SRDENet [27] | IET COMPUT VIS 2023 | 512×1024 | 75.4 | 65 |
    | BiSeNetV2 [28] | IJCV 2021 | 512×1024 | 73.4 / 72.6 | 156 |
    | BiSeNetV2-L [28] | IJCV 2021 | 512×1024 | 75.8 / 75.3 | 47.3 |
    | FasterSeg [29] | arXiv 2019 | 1024×2048 | 73.1 / 71.5 | 163.9 |
    | STDC1-Seg50 [9] | CVPR 2021 | 512×1024 | 72.2 / 71.9 | 250.4 |
    | STDC1-Seg75 [9] | CVPR 2021 | 768×1536 | 74.5 / 75.3 | 126.7 |
    | CPANet-T50 [30] | CAC 2022 | 512×1024 | 72.5 | 234.5 |
    | BiSeNetV3-50 [31] | Neurocomputing 2023 | 512×1024 | 73.4 / 73.5 | 244.3 |
    | CLANet-50 (Ours) |  | 512×1024 | 73.3 / 73.0 | 143 / 294 |
    | CLANet-75 (Ours) |  | 768×1536 | 76.0 / 75.8 | 75 / 164 |
    Table 3. Comparison of accuracy and speed on Cityscapes
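    | Method | Reference | Resolution | mIoU/% | #FPS (PyTorch / TensorRT) |
    |---|---|---|---|---|
    | CAS [32] | CVPR 2019 | 720×960 | 71.8 | 169 |
    | GAS [19] | CVPR 2020 | 720×960 | 72.8 | 153.1 |
    | DSANet [33] | Expert Syst. Appl. 2021 | 720×960 | 69.9 | 75.3 |
    | FSFNet [34] | IEEE T INSTRUM MEAS 2021 | 720×960 | 75.1 | 91 |
    | RELAXNet [23] | Neurocomputing 2022 | 720×960 | 71.2 | 79 |
    | FPANet B [24] | APPL INTELL 2022 | 720×960 | 72.9 | 88 |
    | LETNet [26] | T-ITS 2023 | 720×960 | 70.5 | 200 |
    | SRDENet [27] | IET COMPUT VIS 2023 | 720×960 | 74.8 | 78.3 |
    | BiSeNetV2 [28] | IJCV 2021 | 720×960 | 72.4 | 124.5 |
    | BiSeNetV2-L [28] | IJCV 2021 | 720×960 | 73.2 | 32.7 |
    | STDC1-Seg50 [9] | CVPR 2021 | 720×960 | 73.0 | 197.6 |
    | CPANet-T [30] | CAC 2022 | 720×960 | 73.9 | 213.9 |
    | BiSeNetV3 [31] | Neurocomputing 2023 | 720×960 | 75.1 | 198.4 |
    | CLANet (Ours) |  | 720×960 | 74.8 | 116 / 239 |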
    Table 4. Comparison of accuracy and speed on CamVid
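The CLANet rows in Tables 3 and 4 list two frame rates each, one for PyTorch and one for TensorRT. As a rough illustration of what those figures imply, the sketch below converts each FPS value to per-frame latency and computes the TensorRT-over-PyTorch speedup. The FPS pairs are copied from the tables; the helper names `latency_ms` and `speedup` are illustrative, and the calculation assumes FPS is a simple reciprocal of per-frame time with no batching.

```python
# (PyTorch FPS, TensorRT FPS) for the CLANet rows of Tables 3 and 4.
clanet_fps = {
    "CLANet-50 (Cityscapes, 512x1024)": (143, 294),
    "CLANet-75 (Cityscapes, 768x1536)": (75, 164),
    "CLANet (CamVid, 720x960)": (116, 239),
}


def latency_ms(fps):
    """Per-frame latency in milliseconds, assuming latency = 1 / FPS."""
    return 1000.0 / fps


def speedup(pytorch_fps, tensorrt_fps):
    """Ratio of TensorRT throughput to PyTorch throughput."""
    return tensorrt_fps / pytorch_fps


for name, (pt, trt) in clanet_fps.items():
    print(
        f"{name}: {latency_ms(pt):.2f} ms -> {latency_ms(trt):.2f} ms "
        f"(x{speedup(pt, trt):.2f})"
    )
```

Under these assumptions, TensorRT deployment roughly doubles throughput in all three settings.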