• Acta Optica Sinica
  • Vol. 40, Issue 9, 0915005 (2020)
Kangru Wang1,2,*, Jingang Tan1,2, Liang Du3, Lili Chen1, Jiamao Li1, and Xiaolin Zhang1
Author Affiliations
  • 1Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
  • 2University of Chinese Academy of Sciences, Beijing 100049, China
  • 3Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
    DOI: 10.3788/AOS202040.0915005
    Kangru Wang, Jingang Tan, Liang Du, Lili Chen, Jiamao Li, Xiaolin Zhang. 3D Object Detection Based on Iterative Self-Training[J]. Acta Optica Sinica, 2020, 40(9): 0915005
    Fig. 1. Flow chart of 3D object detection system
    Fig. 2. Architectural diagram of IST-Net
    Fig. 3. Flow chart of iterative self-training
    Fig. 4. Architectural diagram of SAFF-3DOD Net
    Fig. 5. Diagram of SAFFM
    Fig. 6. Qualitative comparison of baseline and our method on estimated disparity map. (a) RGB left image; (b) PSMNET method; (c) our disparity estimation method
    Fig. 7. Qualitative comparison of baseline and our method on estimated point cloud. (a) RGB left image; (b) PSMNET method; (c) our disparity estimation method
    Fig. 8. Qualitative comparison of 3D object detection results. (a) Pseudo-LiDAR; (b) our method
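Fig. 7 compares point clouds derived from estimated disparity maps. As a minimal sketch of how such a point cloud can be obtained, the snippet below back-projects a disparity map with the standard rectified-stereo relation depth = focal length × baseline / disparity. The function name `disparity_to_points`, the toy disparity values, and the camera parameters are illustrative assumptions, not values taken from the paper.

```python
def disparity_to_points(disparity, fx, fy, cx, cy, baseline, min_disp=1e-6):
    """Back-project a disparity map (list of rows, in pixels) to 3D camera-frame points.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels;
    baseline: stereo baseline in metres. All are assumed known from calibration.
    """
    points = []
    for v, row in enumerate(disparity):
        for u, d in enumerate(row):
            if d <= min_disp:          # skip invalid or infinitely distant pixels
                continue
            z = fx * baseline / d      # depth from rectified stereo geometry
            x = (u - cx) * z / fx      # pinhole back-projection, horizontal
            y = (v - cy) * z / fy      # pinhole back-projection, vertical
            points.append((x, y, z))
    return points


# Toy 2x2 disparity map; intrinsics and baseline are illustrative values only.
disparity = [[0.0, 54.0], [27.0, 13.5]]
points = disparity_to_points(disparity, fx=700.0, fy=700.0,
                             cx=0.5, cy=0.5, baseline=0.54)
for p in points:
    print(p)
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error for distant points, which is why small disparity improvements can visibly change the point cloud.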
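    | Parameter | SAFFM in region proposal network: layer setting | SAFFM in region proposal network: output dimension | SAFFM in detection network: layer setting | SAFFM in detection network: output dimension |
    |---|---|---|---|---|
    | F_RGB / F_BEV | - | 3×3×1 | - | 7×7×32 |
    | L0 | - | - | 1--1×1 | 7×7×1 |
    | I_RGB / I_BEV | - | 9×1×1 | - | 49×1×1 |
    | L1 | 36--1×1 | 36×1×1 | 98--1×1 | 98×1×1 |
    | L2 | 36--1×1 | 36×1×1 | 98--1×1 | 98×1×1 |
    | L3 | 18--1×1 | 18×1×1 | 49--1×1 | 49×1×1 |
    | L4 | 9--1×1 | 9×1×1 | 49--1×1 | 49×1×1 |
    | Sigmoid | - | 9×1×1 | - | 49×1×1 |
    | Spatial-attention map | - | 3×3×1 | - | 7×7×1 |
    | Weighted F_RGB and weighted F_BEV | - | 3×3×1 | - | 7×7×32 |
    | F_output | - | 3×3×1 | - | 7×7×32 |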
    Table 1. Detailed configuration of SAFFM
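The detection-network column of Table 1 can be read as a sequence of tensor shapes. The sketch below only walks through those shapes with random placeholder weights; it is not the authors' implementation. Several points are assumptions that the table does not state: that the two 49-value vectors are concatenated before L1, that hidden layers use ReLU, and that the attention map a weights the RGB features while (1 - a) weights the BEV features before summation. All function names are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the walk-through is repeatable


def random_feature(h, w, c):
    """Stand-in feature map of shape h x w x c (nested lists)."""
    return [[[rng.uniform(-1, 1) for _ in range(c)] for _ in range(w)]
            for _ in range(h)]


def conv1x1_to_vector(feat):
    """L0: one 1x1 filter collapses channels (7x7x32 -> 7x7x1), then flatten to 49."""
    weights = [rng.uniform(-0.1, 0.1) for _ in feat[0][0]]
    return [sum(wk * xk for wk, xk in zip(weights, px))
            for row in feat for px in row]


def dense(x, n_out, relu=True):
    """Layer with n_out units and random placeholder weights (ReLU is an assumption)."""
    out = []
    for _ in range(n_out):
        s = sum(rng.uniform(-0.1, 0.1) * v for v in x)
        out.append(max(s, 0.0) if relu else s)
    return out


def saffm_detection(f_rgb, f_bev):
    h, w = len(f_rgb), len(f_rgb[0])
    i_rgb = conv1x1_to_vector(f_rgb)           # I_RGB: 49 values
    i_bev = conv1x1_to_vector(f_bev)           # I_BEV: 49 values
    x = i_rgb + i_bev                          # assumed concatenation: 98 values
    for n_out in (98, 98, 49):                 # L1, L2, L3 widths from Table 1
        x = dense(x, n_out)
    x = dense(x, 49, relu=False)               # L4: 49 values
    attn = [1.0 / (1.0 + math.exp(-v)) for v in x]          # sigmoid: 49 values
    attn_map = [attn[r * w:(r + 1) * w] for r in range(h)]  # 7 x 7 x 1
    # Assumed weighting: a scales F_RGB, (1 - a) scales F_BEV, then the two are summed.
    f_out = [[[attn_map[r][c] * p + (1.0 - attn_map[r][c]) * q
               for p, q in zip(f_rgb[r][c], f_bev[r][c])]
              for c in range(w)] for r in range(h)]
    return attn_map, f_out


attn_map, f_out = saffm_detection(random_feature(7, 7, 32), random_feature(7, 7, 32))
print(len(attn_map), len(attn_map[0]))               # 7 7
print(len(f_out), len(f_out[0]), len(f_out[0][0]))   # 7 7 32
```

The region-proposal column follows the same pattern with 3×3×1 inputs, no L0 layer, and layer widths 36, 36, 18, and 9.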
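    | Method | Disparity error rate /%: object region | Disparity error rate /%: background region | Disparity error rate /%: global image |
    |---|---|---|---|
    | PSMNET(base) | 8.96 | 4.35 | 5.49 |
    | Ours(IST) | 8.69 | 4.18 | 5.27 |
    | Ours(SOL) | 8.72 | 4.20 | 5.30 |
    | Ours(IST+SOL) | 8.60 | 4.17 | 5.25 |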
    Table 2. Quantitative comparison of disparity estimation on KITTI 3D object detection validation set
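    | Method | Disparity error rate /%: object region | Disparity error rate /%: background region | Disparity error rate /%: global image |
    |---|---|---|---|
    | Stereonet | 11.14 | 5.23 | 6.99 |
    | PSMNET | 7.23 | 3.33 | 4.44 |
    | Ours(IST+SOL) | 6.83 | 3.20 | 4.27 |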
    Table 3. Quantitative comparison of disparity estimation on KITTI stereo matching validation set
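Tables 2 and 3 report disparity error rates separately for object regions, background regions, and the whole image. The sketch below shows one way such region-wise rates can be computed. It assumes the KITTI 2015 stereo convention, under which a pixel counts as correct if its error is below 3 px or below 5% of the true disparity; whether the paper uses exactly these thresholds is an assumption. The function name, the mask representation, and the toy values are illustrative.

```python
def disparity_error_rate(pred, gt, mask=None, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose disparity is counted as wrong.

    pred, gt: disparity maps as lists of rows; gt <= 0 marks missing ground truth.
    mask: optional boolean map selecting a region (e.g. object pixels).
    """
    bad = total = 0
    for v, (pred_row, gt_row) in enumerate(zip(pred, gt)):
        for u, (p, g) in enumerate(zip(pred_row, gt_row)):
            if g <= 0:                                   # no ground truth here
                continue
            if mask is not None and not mask[v][u]:      # outside the region
                continue
            total += 1
            err = abs(p - g)
            correct = err < abs_thresh or err < rel_thresh * g
            if not correct:
                bad += 1
    return 100.0 * bad / total if total else float("nan")


# Toy data: one pixel has no ground truth, one pixel is clearly wrong.
gt = [[10.0, 20.0], [0.0, 100.0]]
pred = [[10.5, 25.0], [7.0, 104.0]]
object_mask = [[True, True], [False, False]]
background_mask = [[not m for m in row] for row in object_mask]

object_rate = disparity_error_rate(pred, gt, object_mask)
background_rate = disparity_error_rate(pred, gt, background_mask)
global_rate = disparity_error_rate(pred, gt)
print(object_rate, background_rate, global_rate)
```

In this toy case the pixel with a 4 px error on a true disparity of 100 still counts as correct, because the error is below 5% of the ground truth.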
    | Method | IoU 0.5: Easy | IoU 0.5: Moderate | IoU 0.5: Hard | IoU 0.7: Easy | IoU 0.7: Moderate | IoU 0.7: Hard |
    |---|---|---|---|---|---|---|
    | Pseudo-LiDAR(base) | 92.1/91.6 | 78.3/75.3 | 66.7/63.8 | 75.6/61.5 | 55.6/43.3 | 48.3/36.8 |
    | Ours(IST) | 92.1/91.0 | 80.4/77.4 | 70.8/67.8 | 77.5/61.3 | 59.5/43.3 | 50.6/36.9 |
    | Ours(SOL) | 92.3/91.5 | 80.6/75.9 | 69.1/66.2 | 78.8/63.1 | 58.2/43.7 | 50.1/37.4 |
    | Ours(IST+SOL) | 92.1/91.4 | 81.0/78.0 | 69.2/66.3 | 78.4/63.5 | 59.6/45.0 | 50.8/38.6 |
    | Ours(SAFF) | 92.0/91.5 | 78.3/75.4 | 68.5/65.5 | 77.7/63.0 | 57.1/43.3 | 48.6/37.0 |
    | Ours | 94.5/92.5 | 81.6/78.6 | 73.6/70.7 | 80.9/65.8 | 60.7/46.1 | 52.3/39.4 |
    Table 4. Quantitative comparison of 3D object detection on KITTI 3D object detection validation set (each cell gives A_BEV / A_3D; units of A_BEV and A_3D are both %)
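Table 4 reports results at IoU thresholds of 0.5 and 0.7. As a simplified illustration of what those thresholds mean, the sketch below computes the IoU of two axis-aligned 3D boxes. KITTI boxes also carry a yaw rotation, which this sketch deliberately ignores, so it is an assumption-laden simplification rather than the benchmark's evaluation code. The function name and box coordinates are hypothetical.

```python
def iou_3d_axis_aligned(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max)."""
    inter = 1.0
    for i in range(3):
        overlap = min(a[i + 3], b[i + 3]) - max(a[i], b[i])
        if overlap <= 0:               # boxes do not overlap along this axis
            return 0.0
        inter *= overlap

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return inter / (volume(a) + volume(b) - inter)


ground_truth = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
loose_detection = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)   # shifted by half the box width
tight_detection = (0.2, 0.0, 0.0, 2.2, 2.0, 2.0)   # shifted by a tenth of the width

iou_loose = iou_3d_axis_aligned(ground_truth, loose_detection)
iou_tight = iou_3d_axis_aligned(ground_truth, tight_detection)
print(iou_loose >= 0.5, iou_tight >= 0.7)
```

A half-width shift along one axis already drops the IoU to one third, below both thresholds, which illustrates why the 0.7 columns are much more sensitive to depth errors than the 0.5 columns.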
    | Method | Input | Easy | Moderate | Hard |
    |---|---|---|---|---|
    | MonoPSR[7] | Monocular | 18.33/10.76 | 12.58/7.25 | 9.91/5.85 |
    | Mono3D_PLiDAR[8] | Monocular | 21.27/10.76 | 13.92/7.50 | 11.25/6.10 |
    | TopNet-HighRes[2] | LiDAR | 67.84/12.67 | 53.05/9.28 | 46.99/7.95 |
    | M3D-RPN[9] | Monocular | 21.02/14.76 | 13.67/9.71 | 10.23/7.42 |
    | AM3D[10] | Monocular | 25.03/16.50 | 17.32/10.47 | 14.91/9.52 |
    | RT3D[3] | LiDAR | 56.44/23.74 | 44.00/19.14 | 42.34/18.86 |
    | RT3DStereo[13] | Stereo | 58.81/29.90 | 46.82/23.28 | 38.38/18.96 |
    | Stereo R-CNN[27] | Stereo | 61.92/47.58 | 41.31/30.23 | 33.42/23.72 |
    | Pseudo-LiDAR[12] | Stereo | 67.30/54.53 | 45.00/34.05 | 38.40/28.25 |
    | Ours | Stereo | 71.47/58.70 | 49.61/37.92 | 42.71/31.99 |
    Table 5. 3D object detection results on KITTI test benchmark (each cell gives A_BEV / A_3D; units of A_BEV and A_3D are both %)