• Acta Optica Sinica
  • Vol. 40, Issue 19, 1910001 (2020)
Xuchu Wang1,2,*, Huihuang Liu2, and Yanmin Niu3
Author Affiliations
  • 1Key Laboratory of Optoelectronic Technology and Systems of Ministry of Education, Chongqing University, Chongqing 400040, China
  • 2College of Optoelectronic Engineering, Chongqing University, Chongqing 400040, China
  • 3College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
    DOI: 10.3788/AOS202040.1910001
    Xuchu Wang, Huihuang Liu, Yanmin Niu. Indoor RGB-D Image Semantic Segmentation Based on Dual-Stream Weighted Gabor Convolutional Network Fusion[J]. Acta Optica Sinica, 2020, 40(19): 1910001
    Fig. 1. RGB-D image semantic segmentation by dual-stream weighted Gabor convolutional network fusion
    Fig. 2. Modulation process of WGoFs
    Fig. 3. Convolution process of WGoFs
    Fig. 4. Wide residual blocks. (a) Original residual block; (b) wide residual block 1; (c) wide residual block 2
    Fig. 5. Architecture of WRN-WGCN module
    Fig. 6. Pyramid pooling module
    Fig. 7. Proposed pyramid pooling feature fusion module
    Fig. 8. RGB and depth images and their corresponding semantic labels in dataset. (a) RGB images; (b) depth images; (c) semantic labels
    Fig. 9. Loss curves in training process
    Fig. 10. Test accuracy versus number of scales and number of directions. (a) Test accuracy for different numbers of scales; (b) test accuracy for different numbers of directions
    Fig. 11. Semantic segmentation results obtained by various methods on NYUDv2 dataset. (a) RGB; (b) depth; (c) GT; (d) baseline; (e) WRN-CNN; (f) WGCN; (g) PP-Fusion; (h) FCN; (i) SegNet; (j) ours
    Fig. 12. Semantic segmentation results obtained by various methods on SUN-RGBD dataset. (a) RGB; (b) depth; (c) GT; (d) baseline; (e) WRN-CNN; (f) WGCN; (g) PP-Fusion; (h) FCN; (i) SegNet; (j) ours
    Group name | Output feature size | Block type
    GCConv1 | N×N | [3×3, 8]
    GCConv2 | N×N | [3×3, 16×k; 3×3, 16×k] × L
    GCConv3 | N×N | [3×3, 16×k; 3×3, 16×k] × L
    GCConv4 | (N/2)×(N/2) | [3×3, 32×k; 3×3, 32×k] × L
    Table 1. Structural parameter setting of WRN-WGCN
    Model name | Filter size | Model size /MB
    Model 1 | 5×5 | 163
    Model 2 | 5×5 | 124
    Model 3 | 3×3 | 148
    Model 4 | 3×3 | 117
    Table 2. Model sizes with different filter sizes
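    Method | Acc /% | mAcc /% | mIoU /% | FWIoU /%
    Ours | 66.3 | 50.8 | 40.0 | 53.1
    Variant 1 | 58.3 | 41.6 | 30.1 | 45.8
    Variant 2 | 58.6 | 42.4 | 31.9 | 45.3
    Variant 3 | 60.8 | 48.2 | 35.8 | 50.4
    Variant 4 | 63.2 | 45.8 | 36.4 | 46.6
    FCN [2] | 65.4 | 45.1 | 34.3 | 48.6
    SegNet [3] | 56.2 | 47.6 | 35.1 | 50.1
    Module columns (WRN-CNN, WGCN, PP-Fusion): per-method marks not recoverable.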
    Table 3. Comparison of results for different segmentation algorithms on NYUDv2 dataset
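    The four column headings in the accuracy tables (Acc, mAcc, mIoU, FWIoU) are not defined on this page. The sketch below assumes the definitions commonly used in semantic segmentation (overall pixel accuracy, mean per-class accuracy, mean intersection over union, and frequency-weighted IoU), computed from a class confusion matrix. The function name, the toy matrix, and the choice to skip classes absent from the ground truth are illustrative assumptions, not the authors' evaluation code.

```python
def segmentation_metrics(conf):
    """Compute Acc, mAcc, mIoU and FWIoU from a square confusion matrix.

    conf[i][j] = number of pixels whose ground-truth class is i and whose
    predicted class is j. Definitions are the commonly used ones (assumed,
    not taken from the paper).
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    gt = [sum(conf[i]) for i in range(n)]                          # pixels per ground-truth class
    pred = [sum(conf[i][j] for i in range(n)) for j in range(n)]   # pixels per predicted class
    tp = [conf[i][i] for i in range(n)]                            # correctly labeled pixels

    # Assumption: classes that never occur in the ground truth are skipped.
    present = [i for i in range(n) if gt[i] > 0]

    acc = sum(tp) / total
    macc = sum(tp[i] / gt[i] for i in present) / len(present)
    iou = {i: tp[i] / (gt[i] + pred[i] - tp[i]) for i in present}
    miou = sum(iou.values()) / len(present)
    fwiou = sum(gt[i] / total * iou[i] for i in present)
    return {"Acc": acc, "mAcc": macc, "mIoU": miou, "FWIoU": fwiou}


# Hypothetical two-class example (not data from the paper).
toy = [[5, 1],
       [2, 2]]
metrics = segmentation_metrics(toy)
```

    For the toy matrix, the per-class IoU values are 5/8 and 2/5, so mIoU is 0.5125 and FWIoU is 0.6 × 0.625 + 0.4 × 0.4 = 0.535. Multiplying each value by 100 gives the percentage form used in the tables.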
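    Method | Acc /% | mAcc /% | mIoU /% | FWIoU /%
    Ours | 58.2 | 38.5 | 28.2 | 42.0
    Variant 1 | 45.2 | 33.7 | 21.8 | 37.4
    Variant 2 | 44.8 | 34.5 | 23.1 | 38.6
    Variant 3 | 54.6 | 35.1 | 27.3 | 37.7
    Variant 4 | 56.1 | 34.6 | 26.0 | 36.3
    FCN [2] | 49.5 | 36.5 | 23.7 | 35.8
    SegNet [3] | 47.8 | 34.6 | 26.2 | 38.2
    Module columns (WRN-CNN, WGCN, PP-Fusion): per-method marks not recoverable.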
    Table 4. Comparison of results for different segmentation algorithms on SUN-RGBD dataset
    Method | Model size /MB | Inference time /ms
    Ours | 117 | 42
    Variant 1 | 381 | 76
    Variant 2 | 115 | 35
    Variant 3 | 187 | 48
    Variant 4 | 245 | 51
    FCN [2] | 549 | 43
    SegNet [3] | 126 | 58
    Module columns (WRN-CNN, WGCN, PP-Fusion): per-method marks not recoverable.
    Table 5. Comparison of inference time and space complexity for different algorithms