• Optics and Precision Engineering
  • Vol. 29, Issue 11, 2703 (2021)
Bao-qing GUO1,2,* and Guang-fei XIE1
Author Affiliations
  • 1 School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China
  • 2 Frontiers Science Center for Smart High-speed Railway System, Beijing Jiaotong University, Beijing 100044, China
    DOI: 10.37188/OPE.20212911.2703
    Bao-qing GUO, Guang-fei XIE. Object detection algorithm based on image and point cloud fusion with N3D_DIOU[J]. Optics and Precision Engineering, 2021, 29(11): 2703
    Fig. 1. Detection network framework
    Fig. 2. Vote model network
    Fig. 3. Structure of FCN network
    Fig. 4. Relationship between three detection boxes and target boxes
    Fig. 5. Target box and detection box with angle deviation
    Fig. 6. 3D detection AP and recall curves for cars, pedestrians and cyclists
    Fig. 7. Visualization results of cars
    Fig. 8. Visualization results of pedestrians and cyclists

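    Algorithm 1: N3D_DIOU_loss

    Input: detection box $B^p$, target box $B^g$, predicted center $C^p$, and target box center $C^g$:

      $B^p = (x_1^p, y_1^p, z_1^p, x_2^p, y_2^p, z_2^p)$
      $B^g = (x_1^g, y_1^g, z_1^g, x_2^g, y_2^g, z_2^g)$
      $C^p = (x_c^p, y_c^p, z_c^p)$
      $C^g = (x_c^g, y_c^g, z_c^g)$

    Output: N3D_DIOU_loss

    Because the target box and the detection box are aligned with the coordinate axes beforehand, the following is guaranteed: $x_2^p > x_1^p$, $y_2^p > y_1^p$, $z_2^p > z_1^p$, $x_2^g > x_1^g$, $y_2^g > y_1^g$, $z_2^g > z_1^g$.

    1. Compute the volume of $B^g$: $V^g = (x_2^g - x_1^g) \cdot (y_2^g - y_1^g) \cdot (z_2^g - z_1^g)$

    2. Compute the volume of $B^p$: $V^p = (x_2^p - x_1^p) \cdot (y_2^p - y_1^p) \cdot (z_2^p - z_1^p)$

    3. Compute the volume of the intersection of the two boxes ($V^i$):

      $x_1^i = \max(x_1^p, x_1^g)$, $x_2^i = \min(x_2^p, x_2^g)$
      $y_1^i = \max(y_1^p, y_1^g)$, $y_2^i = \min(y_2^p, y_2^g)$
      $z_1^i = \max(z_1^p, z_1^g)$, $z_2^i = \min(z_2^p, z_2^g)$

      If $x_2^i > x_1^i$, $y_2^i > y_1^i$ and $z_2^i > z_1^i$:
        $V^i = (x_2^i - x_1^i) \cdot (y_2^i - y_1^i) \cdot (z_2^i - z_1^i)$
      Otherwise: $V^i = 0$

    4. Compute the volume of the smallest bounding box enclosing both boxes ($V^c$):

      $x_1^c = \min(x_1^p, x_1^g)$, $x_2^c = \max(x_2^p, x_2^g)$
      $y_1^c = \min(y_1^p, y_1^g)$, $y_2^c = \max(y_2^p, y_2^g)$
      $z_1^c = \min(z_1^p, z_1^g)$, $z_2^c = \max(z_2^p, z_2^g)$

      $V^c = (x_2^c - x_1^c) \cdot (y_2^c - y_1^c) \cdot (z_2^c - z_1^c)$

    5. Compute the distance $\rho$ between the centers of the target box and the detection box, and the diagonal length $c$ of the smallest enclosing box:

      $\rho^2 = (x_c^p - x_c^g)^2 + (y_c^p - y_c^g)^2 + (z_c^p - z_c^g)^2$
      $c^2 = (x_2^c - x_1^c)^2 + (y_2^c - y_1^c)^2 + (z_2^c - z_1^c)^2$

    6. IOU_3D = $V^i / V^u$, where $V^u = V^p + V^g - V^i$

    7. DIOU_3D = IOU_3D $- \rho^2(b, b^{gt}) / c^2$, where $b = C^p$ and $b^{gt} = C^g$ are the box centers used in step 5

    8. DIOU_3D_loss = 1 - DIOU_3D

    9. N3D_DIOU_loss = $\omega_3$ · DIOU_3D_loss + $\omega_4$ · L1_angle_loss

    ($\omega_3$ and $\omega_4$ are weight coefficients, set to 0.5 and 0.03 respectively in this paper; L1_angle_loss is an L1 loss function used to supervise the angle deviation.)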

    Algorithm 1. Pseudocode of N3D_DIOU_loss
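
The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `diou_3d` and `n3d_diou_loss` are hypothetical, boxes are assumed to be axis-aligned tuples `(x1, y1, z1, x2, y2, z2)`, and the angle term is assumed to be the absolute difference between a predicted and a target angle, since the source only states that an L1 loss supervises the angle deviation.

```python
from typing import Sequence


def diou_3d(bp: Sequence[float], bg: Sequence[float],
            cp: Sequence[float], cg: Sequence[float]) -> float:
    """DIOU_3D of two axis-aligned boxes given as (x1, y1, z1, x2, y2, z2)."""
    # Steps 1-2: volumes of the target box and the detection box.
    vg = (bg[3] - bg[0]) * (bg[4] - bg[1]) * (bg[5] - bg[2])
    vp = (bp[3] - bp[0]) * (bp[4] - bp[1]) * (bp[5] - bp[2])

    # Step 3: intersection volume; zero unless the boxes overlap on every axis.
    lo = [max(bp[k], bg[k]) for k in range(3)]
    hi = [min(bp[k + 3], bg[k + 3]) for k in range(3)]
    if all(h > l for l, h in zip(lo, hi)):
        vi = (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])
    else:
        vi = 0.0

    # Step 4: side lengths of the smallest enclosing box
    # (only these extents are needed for c^2 below).
    ext = [max(bp[k + 3], bg[k + 3]) - min(bp[k], bg[k]) for k in range(3)]

    # Step 5: squared center distance and squared enclosing-box diagonal.
    rho2 = sum((p - g) ** 2 for p, g in zip(cp, cg))
    c2 = sum(e ** 2 for e in ext)

    # Steps 6-7: IoU minus the normalized center distance.
    iou = vi / (vp + vg - vi)
    return iou - rho2 / c2


def n3d_diou_loss(bp, bg, cp, cg, angle_p, angle_g, w3=0.5, w4=0.03):
    """Weighted sum of the DIoU loss and an L1 angle term (steps 8-9)."""
    diou_loss = 1.0 - diou_3d(bp, bg, cp, cg)   # step 8
    l1_angle_loss = abs(angle_p - angle_g)      # assumed form of the L1 angle term
    return w3 * diou_loss + w4 * l1_angle_loss  # step 9


# Example: a detection box shifted by 1 along x relative to the target box.
print(n3d_diou_loss((0, 0, 0, 2, 2, 2), (1, 0, 0, 3, 2, 2),
                    (1, 1, 1), (2, 1, 1), angle_p=0.1, angle_g=0.0))
```

The default weights 0.5 and 0.03 are the values the paper reports for ω3 and ω4. For identical boxes with equal angles the loss is zero, and for disjoint boxes the center-distance term still contributes even though IOU_3D is zero.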
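| Method | Car (Easy) | Car (Moderate) | Car (Hard) | Pedestrian (Easy) | Pedestrian (Moderate) | Pedestrian (Hard) | Cyclist (Easy) | Cyclist (Moderate) | Cyclist (Hard) |
|---|---|---|---|---|---|---|---|---|---|
| MV3D [1] | 71.29 | 62.68 | 56.56 | N/A | N/A | N/A | N/A | N/A | N/A |
| ContFusion [28] | 86.32 | 73.25 | 67.81 | N/A | N/A | N/A | N/A | N/A | N/A |
| VoxelNet [15] | 81.97 | 65.46 | 62.85 | 57.86 | 53.42 | 48.87 | 67.17 | 47.65 | 45.11 |
| F-PointNet [31] | 83.76 | 70.92 | 63.65 | 70.00 | 61.32 | 53.59 | 77.15 | 56.49 | 53.37 |
| F-ConvNet [24] | 89.02 | 78.80 | 77.09 | N/A | N/A | N/A | N/A | N/A | N/A |
| IPOD [29] | 84.10 | 76.40 | 75.30 | 69.60 | 62.30 | 54.60 | 81.90 | 57.10 | 54.60 |
| PointPillars [30] | 79.05 | 74.99 | 68.30 | 52.08 | 43.53 | 41.49 | 75.78 | 59.07 | 52.92 |
| Proposed method | 89.73 | 79.43 | 77.79 | 70.37 | 58.70 | 51.75 | 80.88 | 60.43 | 56.93 |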
    Table 1. 3D detection AP (%) of cars, pedestrians and cyclists on KITTI val set
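| Method | Car (Easy) | Car (Moderate) | Car (Hard) | Pedestrian (Easy) | Pedestrian (Moderate) | Pedestrian (Hard) | Cyclist (Easy) | Cyclist (Moderate) | Cyclist (Hard) |
|---|---|---|---|---|---|---|---|---|---|
| MV3D [1] | 86.55 | 78.10 | 76.67 | N/A | N/A | N/A | N/A | N/A | N/A |
| ContFusion [28] | 95.44 | 87.34 | 82.43 | N/A | N/A | N/A | N/A | N/A | N/A |
| VoxelNet [15] | 89.60 | 84.81 | 78.57 | 65.95 | 61.05 | 56.98 | 74.41 | 52.18 | 50.49 |
| F-PointNet [31] | 88.16 | 84.92 | 76.44 | 72.38 | 66.39 | 59.57 | 81.82 | 60.03 | 56.32 |
| F-ConvNet [24] | 90.23 | 88.79 | 86.84 | N/A | N/A | N/A | N/A | N/A | N/A |
| IPOD [29] | 88.30 | 86.40 | 84.60 | 72.40 | 67.80 | 59.70 | 84.30 | 61.80 | 57.70 |
| PointPillars [30] | 88.35 | 86.10 | 79.83 | 58.66 | 50.23 | 47.19 | 79.14 | 62.25 | 56.00 |
| Proposed method | 97.51 | 89.05 | 86.99 | 72.59 | 63.57 | 59.21 | 86.21 | 65.66 | 60.58 |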
    Table 2. BEV detection AP(%) of cars, pedestrians and cyclists on KITTI val set
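| Metric | Method | Easy | Moderate | Hard |
|---|---|---|---|---|
| 3D | F-ConvNet | 89.02 | 78.80 | 77.09 |
| 3D | F-ConvNet + vote model | 89.23 | 79.06 | 77.42 |
| 3D | F-ConvNet + N3D-DIOU_loss | 89.34 | 79.21 | 77.63 |
| 3D | F-ConvNet + vote model + N3D-DIOU_loss | 89.73 | 79.43 | 77.79 |
| BEV | F-ConvNet | 90.23 | 88.79 | 86.86 |
| BEV | F-ConvNet + vote model | 90.53 | 89.13 | 86.92 |
| BEV | F-ConvNet + N3D-DIOU_loss | 90.31 | 88.98 | 86.63 |
| BEV | F-ConvNet + vote model + N3D-DIOU_loss | 97.51 | 89.05 | 86.99 |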

    Table 3. 3D and BEV detection performance
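
To make the ablation easier to read, the 3D detection rows of Table 3 can be turned into per-difficulty AP gains over the F-ConvNet baseline. The short sketch below only subtracts values copied from the table; the helper name `ap_gain` is hypothetical.

```python
# AP values (Easy, Moderate, Hard) copied from the 3D rows of Table 3.
baseline = (89.02, 78.80, 77.09)  # F-ConvNet
variants = {
    "F-ConvNet + vote model": (89.23, 79.06, 77.42),
    "F-ConvNet + N3D-DIOU_loss": (89.34, 79.21, 77.63),
    "F-ConvNet + vote model + N3D-DIOU_loss": (89.73, 79.43, 77.79),
}


def ap_gain(variant, base):
    """Per-difficulty AP difference in percentage points, rounded to 2 decimals."""
    return tuple(round(v - b, 2) for v, b in zip(variant, base))


gains = {name: ap_gain(ap, baseline) for name, ap in variants.items()}
for name, gain in gains.items():
    print(f"{name}: {gain}")
```

On these numbers the combined variant improves the 3D AP by 0.71, 0.63 and 0.70 points on the easy, moderate and hard levels, which is larger than the gain from either component alone.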
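| Setting | Method | Easy | Moderate | Hard |
|---|---|---|---|---|
| Fine-tuning | F-ConvNet | 86.51 | 76.57 | 68.17 |
| Fine-tuning | F-ConvNet + vote model | 87.73 | 77.00 | 68.42 |
| Fine-tuning | F-ConvNet + N3D-DIOU_loss | 88.06 | 77.49 | 68.76 |
| Fine-tuning | F-ConvNet + vote model + N3D-DIOU_loss | 88.47 | 77.83 | 69.04 |
| Parameter fine-tuning | F-ConvNet | 89.02 | 78.80 | 77.09 |
| Parameter fine-tuning | F-ConvNet + vote model | 89.23 | 79.06 | 77.42 |
| Parameter fine-tuning | F-ConvNet + N3D-DIOU_loss | 89.34 | 79.21 | 77.63 |
| Parameter fine-tuning | F-ConvNet + vote model + N3D-DIOU_loss | 89.73 | 79.43 | 77.79 |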

    Table 4. Comparison of parameter tuning experiments