Fig. 1. Illustration of the network framework
Fig. 2. Comparison of standard convolution and dilated convolution filters. (a) Standard convolution filter; (b) dilated convolution filter with dilation ratio of 2; (c) dilated convolution filter with dilation ratio of 3
Fig. 3. Visual comparison of the dilated and standard convolution processes. (a) Standard convolution; (b) dilated convolution with dilation ratio of 2; (c) dilated convolution with dilation ratio of 3
Fig. 4. Flow chart of optimizing global camera pose by ORB-SLAM algorithm
Fig. 5. Projection process of three-dimensional space points onto the image plane
Fig. 6. Curves for different losses. (a) Reconstruction loss; (b) smoothness loss; (c) total loss
Fig. 7. Camera pose trajectories for different sequences in the KITTI Odometry dataset. (a) 00; (b) 01; (c) 09; (d) 02; (e) 03; (f) 10
Fig. 8. Qualitative comparison of depth prediction. (a) RGB input image; (b) method of Garg et al. [11]; (c) SfMLearner method [4]; (d) our method; (e) ground truth
Fig. 9. Visualization comparison of depth details. (a)(c) Input images; (b)(d) output images
| Method | Sequence 09: t_err /% | Sequence 09: r_err per 100 m /(°) | Sequence 10: t_err /% | Sequence 10: r_err per 100 m /(°) |
|---|---|---|---|---|
| Luo et al. [20] | 3.72 | 1.60 | 6.06 | 2.22 |
| Zhou et al. [4] | 18.77 | 3.21 | 14.33 | 3.30 |
| Li et al. [21] | 7.01 | 3.61 | 10.63 | 4.65 |
| Zhan et al. [13] (Tem) | 11.93 | 3.91 | 12.45 | 3.46 |
| Zhan et al. [13] (New York University datasets) | 11.92 | 3.60 | 12.62 | 3.43 |
| Ours | 1.70 | 0.50 | 1.43 | 0.52 |
Table 1. RMSE comparison of 09 and 10 sequences in the KITTI Odometry dataset
| Method | Supervised | Data | Error: A | Error: S | Error: R | Error: lg R | Accuracy: δ1 /% | Accuracy: δ2 /% | Accuracy: δ3 /% |
|---|---|---|---|---|---|---|---|---|---|
| Method in Ref. [5] | √ | KITTI | 0.214 | 1.605 | 6.563 | 0.292 | 67.3 | 88.4 | 95.7 |
| Method in Ref. [6] | √ | KITTI | 0.203 | 1.548 | 6.307 | 0.282 | 70.2 | 89.0 | 95.8 |
| Method in Ref. [7] | √ | KITTI | 0.202 | 1.614 | 6.523 | 0.275 | 67.8 | 89.5 | 96.5 |
| Method in Ref. [22] (photo) | × | KITTI | 0.211 | 1.980 | 6.154 | 0.264 | 73.2 | 89.8 | 95.9 |
| Method in Ref. [22] (photo+ad) | × | KITTI | 0.220 | 1.976 | 6.340 | 0.273 | 70.8 | 86.7 | 93.4 |
| Method in Ref. [4] | × | KITTI | 0.208 | 1.768 | 6.856 | 0.283 | 67.8 | 88.5 | 95.7 |
| Method in Ref. [4] (without explainability masks) | × | KITTI | 0.221 | 2.226 | 7.527 | 0.294 | 67.6 | 88.5 | 95.4 |
| Ours | × | KITTI | 0.189 | 1.592 | 6.432 | 0.268 | 71.4 | 91.1 | 96.3 |
Table 2. Comparison of TUM evaluation results for depth estimation model