Laser & Optoelectronics Progress
Vol. 62, Issue 2, 0212001 (2025)

Multiview 3D Object Detection Based on Improved DETR3D

Yuhan Zhang^1,2,3, Miaohua Huang^1,2,3,*, Gengyao Chen^1,2,3, Yanzhou Li^1,2,3, and Yiming Wu^1,2,3

Author Affiliations

¹Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan 430070, Hubei, China

²Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan 430070, Hubei, China

³Hubei Research Center for New Energy & Intelligent Connected Vehicle, Wuhan University of Technology, Wuhan 430070, Hubei, China

DOI: 10.3788/LOP240912

Yuhan Zhang, Miaohua Huang, Gengyao Chen, Yanzhou Li, Yiming Wu. Multiview 3D Object Detection Based on Improved DETR3D[J]. Laser & Optoelectronics Progress, 2025, 62(2): 0212001
