Author Affiliations
Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
Fig. 1. Principle of normal calculation with the nearest-neighbor point sampling method
Fig. 2. Feature connection module based on depth channel attention mechanism
Fig. 3. Overall architecture of monocular depth estimation method
Fig. 4. Architecture of encoder and decoder sub-networks. (a) Sub-network structure of encoder; (b)-(d) sub-network structures of decoder
Fig. 5. Depth prediction results of different methods on NYU Depth v2 dataset
Fig. 6. 3D reconstruction results based on monocular depth
Fig. 7. Qualitative results of ablation experiments based on network architecture
Fig. 8. Qualitative results of ablation experiments based on constraints
Fig. 9. Quantitative results on the test set for different depth ranges
Fig. 10. Quantitative results on selected images for different depth ranges. (a) 10 images with worst RMSE; (b) 10 images with worst REL; (c) 10 images with worst TH1
| Method | RMSE | REL | | | |
|---|---|---|---|---|---|
| Ref. [18] | 0.907 | 0.215 | 0.611 | 0.887 | 0.971 |
| Ref. [37] | 0.824 | 0.230 | 0.614 | 0.883 | 0.971 |
| Ref. [38] | 0.620 | 0.149 | 0.806 | 0.883 | 0.987 |
| Ref. [39] | 0.635 | 0.143 | 0.788 | 0.958 | 0.991 |
| Ref. [40] | 0.819 | 0.232 | 0.646 | 0.892 | 0.968 |
| Ref. [24] | 0.641 | 0.158 | 0.769 | 0.950 | 0.988 |
| Ref. [19] | 0.573 | 0.127 | 0.811 | 0.953 | 0.988 |
| Ref. [22] | 0.586 | 0.121 | 0.811 | 0.954 | 0.987 |
| Ref. [26] | 0.600 | 0.144 | 0.791 | 0.960 | 0.991 |
| Ref. [41] | 0.572 | 0.139 | 0.815 | 0.963 | 0.991 |
| Ref. [27] | 0.599 | 0.159 | 0.772 | 0.942 | 0.984 |
| Ref. [37] | 0.555 | 0.126 | 0.843 | 0.968 | 0.991 |
| This paper | 0.552 | 0.164 | 0.768 | 0.940 | 0.984 |
Table 1. Quantitative comparison between proposed method and other different methods on NYU Depth v2 dataset
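The metrics in Table 1 can be illustrated with a minimal sketch. It assumes RMSE is the root-mean-square error, REL is the mean absolute relative error, and the three unlabeled columns are the conventional threshold accuracies: the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25, 1.25² and 1.25³. The function name `depth_metrics` and the `base` parameter are hypothetical, and the paper's exact evaluation protocol (valid-pixel mask, depth cap, cropping) is not given here.

```python
import math


def depth_metrics(pred, gt, base=1.25):
    """Return (rmse, rel, [th1, th2, th3]) for two flat sequences of depths.

    Assumption: pixels with a non-positive ground-truth or predicted depth
    are treated as invalid and skipped, which also avoids division by zero.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid depth pairs")

    # Root-mean-square error over valid pixels.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    # Mean absolute relative error over valid pixels.
    rel = sum(abs(p - g) / g for p, g in pairs) / n

    # Threshold accuracy: share of pixels with max ratio below base**k.
    ratios = [max(p / g, g / p) for p, g in pairs]
    th = [sum(r < base ** k for r in ratios) / n for k in (1, 2, 3)]

    return rmse, rel, th


if __name__ == "__main__":
    rmse, rel, th = depth_metrics([1.0, 2.0, 4.0], [1.0, 2.5, 2.0])
    print(f"RMSE={rmse:.3f} REL={rel:.3f} TH={th}")
```

Lower RMSE and REL are better, while higher threshold accuracies are better, which is how the rows of Table 1 should be read under these assumptions.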
| Method | Running time /ms | Frame rate /(frame·s⁻¹) | RMSE |
|---|---|---|---|
| Ref. [18] | 23 | 43 | 0.907 |
| Ref. [19] | 237 | 10 | 0.604 |
| Ref. [24] | 96 | 6 | 0.753 |
| This paper | 58 | 17 | 0.552 |
Table 2. Comparison of running speeds of different methods
| Method | RMSE | REL | | | |
|---|---|---|---|---|---|
| Without skip connect | 0.727 | 0.222 | 0.631 | 0.885 | 0.969 |
| Without SE_Concat_Block | 0.604 | 0.177 | 0.731 | 0.922 | 0.976 |
| Baseline | 0.586 | 0.178 | 0.738 | 0.932 | 0.982 |
| U-net | 0.647 | 0.202 | 0.681 | 0.915 | 0.978 |
| Resnet-101 | 0.628 | 0.189 | 0.704 | 0.921 | 0.981 |
Table 3. Quantitative results of ablation experiments based on network architecture
| Method | RMSE | REL | | | |
|---|---|---|---|---|---|
| Baseline | 0.594 | 0.177 | 0.740 | 0.926 | 0.980 |
| With | 0.586 | 0.178 | 0.738 | 0.932 | 0.982 |
| With and | 0.561 | 0.165 | 0.761 | 0.935 | 0.983 |
| With | 0.552 | 0.164 | 0.768 | 0.940 | 0.984 |
Table 4. Quantitative results of ablation experiments based on constraints