Laser & Optoelectronics Progress, Vol. 57, Issue 6, 061007 (2020)
Unsupervised Monocular Depth Estimation by Fusing Dilated Convolutional Network and SLAM
Renyue Dai, Zhijun Fang*, and Yongbin Gao
Author Affiliations
School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600, China
DOI: 10.3788/LOP57.061007
Renyue Dai, Zhijun Fang, Yongbin Gao. Unsupervised Monocular Depth Estimation by Fusing Dilated Convolutional Network and SLAM[J]. Laser & Optoelectronics Progress, 2020, 57(6): 061007
Fig. 1. Illustration of the network framework
Fig. 2. Comparison of standard convolution and dilated convolution filters. (a) Standard convolution filter; (b) dilated convolution filter with dilation ratio of 2; (c) dilated convolution filter with dilation ratio of 3
Fig. 3. Comparison of the convolution process for standard and dilated convolutions. (a) Standard convolution; (b) dilated convolution with dilation ratio of 2; (c) dilated convolution with dilation ratio of 3
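As Figs. 2 and 3 illustrate, a 3×3 filter with dilation ratio d covers an effective receptive field of 3 + 2(d − 1) per side, i.e. 5×5 for d = 2 and 7×7 for d = 3, without adding parameters. A minimal PyTorch sketch of this behavior (the channel counts are illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn

# 3x3 convolutions with dilation ratios 1, 2, and 3 (cf. Figs. 2 and 3).
# Setting padding = dilation keeps the spatial resolution unchanged.
convs = nn.ModuleList([
    nn.Conv2d(64, 64, kernel_size=3, dilation=d, padding=d)
    for d in (1, 2, 3)
])

x = torch.randn(1, 64, 32, 32)
for d, conv in zip((1, 2, 3), convs):
    rf = 3 + (3 - 1) * (d - 1)          # effective receptive field: 3, 5, 7
    print(d, rf, conv(x).shape)         # spatial size stays 32x32
```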
Fig. 4. Flow chart of global camera pose optimization with the ORB-SLAM algorithm
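The full ORB-SLAM pipeline in Fig. 4 (tracking, local mapping, loop closing, and global bundle adjustment) is far beyond a snippet, but the two-view relative-pose step its tracking thread builds on can be illustrated with OpenCV. This is a heavily simplified stand-in for the pose front end, not the paper's implementation:

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Estimate the relative camera pose (R, t) between two grayscale
    frames from matched ORB features; a simplified stand-in for the
    tracking step that ORB-SLAM refines with bundle adjustment."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t   # t is recovered only up to scale for a monocular camera
```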
Fig. 5. Projection process of three-dimensional space points onto the image plane
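The projection in Fig. 5 follows the pinhole model, p ≃ K(RP + t). A small NumPy sketch, using illustrative KITTI-like intrinsics rather than calibrated values:

```python
import numpy as np

def project(points_3d, K, R, t):
    """Project Nx3 world points onto the image plane (cf. Fig. 5).
    Pinhole model: p ~ K (R P + t), with intrinsics K assumed known."""
    cam = points_3d @ R.T + t          # world -> camera coordinates
    pix = cam @ K.T                    # apply camera intrinsics
    return pix[:, :2] / pix[:, 2:3]    # perspective division -> (u, v)

# Illustrative intrinsics only (not the dataset's calibration)
K = np.array([[721.5,   0.0, 609.6],
              [  0.0, 721.5, 172.9],
              [  0.0,   0.0,   1.0]])
uv = project(np.array([[2.0, 1.0, 10.0]]), K, np.eye(3), np.zeros(3))
print(uv)   # pixel coordinates of the projected point
```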
Fig. 6. Curves for different losses. (a) Reconstruction loss; (b) smoothness loss; (c) total loss
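The paper's exact loss weights are not reproduced here; the sketch below shows a typical unsupervised formulation of the three curves in Fig. 6: an L1 photometric reconstruction term, an edge-aware smoothness term, and their weighted sum (PyTorch; definitions and the weight lambda_s are assumed standard, not taken from the paper):

```python
import torch

def reconstruction_loss(target, warped):
    """L1 photometric difference between the target frame and the view
    synthesized from the predicted depth and camera pose."""
    return (target - warped).abs().mean()

def smooth_loss(disp, img):
    """Edge-aware first-order smoothness: penalize disparity gradients,
    downweighted where the input image itself has strong gradients."""
    dx = (disp[:, :, :, :-1] - disp[:, :, :, 1:]).abs()
    dy = (disp[:, :, :-1, :] - disp[:, :, 1:, :]).abs()
    wx = torch.exp(-(img[:, :, :, :-1] - img[:, :, :, 1:]).abs().mean(1, keepdim=True))
    wy = torch.exp(-(img[:, :, :-1, :] - img[:, :, 1:, :]).abs().mean(1, keepdim=True))
    return (dx * wx).mean() + (dy * wy).mean()

def total_loss(target, warped, disp, lambda_s=0.1):
    # lambda_s is a placeholder weight, not the paper's value
    return reconstruction_loss(target, warped) + lambda_s * smooth_loss(disp, target)
```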
Fig. 7. Camera pose trajectories for different sequences in the KITTI Odometry dataset. (a) 00; (b) 01; (c) 09; (d) 02; (e) 03; (f) 10
Fig. 8. Qualitative comparison of depth prediction. (a) RGB input image; (b) method of Garg et al.[11]; (c) sfmlearner method[4]; (d) our method; (e) ground truth
Fig. 9. Visualization comparison of depth details. (a)(c) Input images; (b)(d) output images
Method                                          Sequence 09                        Sequence 10
                                                t_error /%   r_error /(°/100 m)    t_error /%   r_error /(°/100 m)
Luo et al.[20]                                  3.72         1.60                  6.06         2.22
Zhou et al.[4]                                  18.77        3.21                  14.33        3.30
Li et al.[21]                                   7.01         3.61                  10.63        4.65
Zhan et al.[13] (Tem)                           11.93        3.91                  12.45        3.46
Zhan et al.[13] (New York University datasets)  11.92        3.60                  12.62        3.43
Ours                                            1.70         0.50                  1.43         0.52
Table 1. RMSE comparison on sequences 09 and 10 of the KITTI Odometry dataset
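The t_error and r_error columns follow the KITTI odometry protocol: average relative translation drift (%) and rotation drift (° per 100 m) over subsequences of fixed path length. A simplified single-length NumPy sketch of this computation (the official evaluation averages over several lengths; this is not the benchmark code):

```python
import numpy as np

def trajectory_distances(poses):
    """Cumulative path length along a list of 4x4 camera-to-world poses."""
    d = [0.0]
    for a, b in zip(poses[:-1], poses[1:]):
        d.append(d[-1] + np.linalg.norm(b[:3, 3] - a[:3, 3]))
    return np.asarray(d)

def odometry_errors(gt, est, length=100.0):
    """Average relative translation (%) and rotation (deg per 100 m) drift
    over subsequences of the given path length, in the spirit of the
    KITTI odometry benchmark."""
    dist = trajectory_distances(gt)
    t_errs, r_errs = [], []
    for i in range(len(gt)):
        j = int(np.searchsorted(dist, dist[i] + length))
        if j >= len(gt):
            break
        dg = np.linalg.inv(gt[i]) @ gt[j]     # ground-truth relative motion
        de = np.linalg.inv(est[i]) @ est[j]   # estimated relative motion
        err = np.linalg.inv(de) @ dg          # residual transform
        t_errs.append(np.linalg.norm(err[:3, 3]) / length * 100.0)
        ang = np.arccos(np.clip((np.trace(err[:3, :3]) - 1.0) / 2.0, -1.0, 1.0))
        r_errs.append(np.degrees(ang) / length * 100.0)
    return float(np.mean(t_errs)), float(np.mean(r_errs))
```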
Method                                             Supervised   Data    Error                             Accuracy
                                                                        A       SR      R       lg R      δ1 /%   δ2 /%   δ3 /%
Method in Ref. [5]                                 √            KITTI   0.214   1.605   6.563   0.292     67.3    88.4    95.7
Method in Ref. [6]                                 √            KITTI   0.203   1.548   6.307   0.282     70.2    89.0    95.8
Method in Ref. [7]                                 √            KITTI   0.202   1.614   6.523   0.275     67.8    89.5    96.5
Method in Ref. [22] (photo)                        ×            KITTI   0.211   1.980   6.154   0.264     73.2    89.8    95.9
Method in Ref. [22] (photo+ad)                     ×            KITTI   0.220   1.976   6.340   0.273     70.8    86.7    93.4
Method in Ref. [4]                                 ×            KITTI   0.208   1.768   6.856   0.283     67.8    88.5    95.7
Method in Ref. [4] (without explainability masks)  ×            KITTI   0.221   2.226   7.527   0.294     67.6    88.5    95.4
Ours                                               ×            KITTI   0.189   1.592   6.432   0.268     71.4    91.1    96.3
Table 2. Comparison of TUM evaluation results for the depth estimation models
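The error columns A, SR, R, and lg R correspond to the standard monocular-depth metrics Abs Rel, Sq Rel, RMSE, and RMSE log, and δ1–δ3 are the ratio-threshold accuracies. A NumPy sketch of their usual definitions (assumed standard formulas, not code from the paper):

```python
import numpy as np

def depth_metrics(gt, pred):
    """Standard monocular depth metrics over valid pixels (cf. Table 2):
    Abs Rel (A), Sq Rel (SR), RMSE (R), RMSE log (lg R), and the
    delta accuracies (fraction of pixels with max ratio below 1.25^k)."""
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    ratio = np.maximum(gt / pred, pred / gt)
    d1, d2, d3 = [np.mean(ratio < 1.25 ** k) * 100 for k in (1, 2, 3)]
    return abs_rel, sq_rel, rmse, rmse_log, d1, d2, d3
```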