• Laser & Optoelectronics Progress
  • Vol. 62, Issue 2, 0212001 (2025)
Yuhan Zhang1,2,3, Miaohua Huang1,2,3,*, Gengyao Chen1,2,3, Yanzhou Li1,2,3, and Yiming Wu1,2,3
Author Affiliations
  • 1Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan 430070, Hubei, China
  • 2Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan 430070, Hubei, China
  • 3Hubei Research Center for New Energy & Intelligent Connected Vehicle, Wuhan University of Technology, Wuhan 430070, Hubei, China
    DOI: 10.3788/LOP240912
    Yuhan Zhang, Miaohua Huang, Gengyao Chen, Yanzhou Li, Yiming Wu. Multiview 3D Object Detection Based on Improved DETR3D[J]. Laser & Optoelectronics Progress, 2025, 62(2): 0212001
    References

    [1] Li Z Q, Wang W H, Li H Y et al. BEVFormer: learning bird’s-eye-view representation from multi-camera images via spatiotemporal transformers[M]. Computer vision-ECCV 2022, 13669, 1-18(2022).

    [2] Li Y H, Ge Z, Yu G Y et al. BEVDepth: acquisition of reliable depth for multi-view 3D object detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 37, 1477-1485(2023).

    [3] Liang T T, Xie H W, Yu K C et al. BEVFusion: a simple and robust LiDAR-camera fusion framework[EB/OL]. https://arxiv.org/abs/2205.13790

    [4] Liu Z J, Tang H T, Amini A et al. BEVFusion: multi-task multi-sensor fusion with unified bird’s-eye view representation[C], 2774-2781(2023).

    [5] Li Z Q, Yu Z D, Wang W H et al. FB-BEV: BEV representation from forward-backward view transformations[C], 6896-6905(2023).

    [6] Philion J, Fidler S. Lift, splat, shoot: encoding images from arbitrary camera rigs by implicitly unprojecting to 3D[M]. Computer vision-ECCV 2020, 12359, 194-210(2020).

    [7] Wang Y, Guizilini V, Zhang T Y et al. DETR3D: 3D object detection from multi-view images via 3D-to-2D queries[EB/OL]. http://arxiv.org/abs/2110.06922v1

    [8] Sun P Z, Zhang R F, Jiang Y et al. Sparse R-CNN: end-to-end object detection with learnable proposals[C], 14449-14458(2021).

    [9] Liu Y F, Wang T C, Zhang X Y et al. PETR: position embedding transformation for multi-view 3D object detection[M]. Computer vision-ECCV 2022, 13687, 531-548(2022).

    [10] Li F, Zhang H, Liu S L et al. DN-DETR: accelerate DETR training by introducing query DeNoising[C], 13609-13617(2022).

    [11] Carion N, Massa F, Synnaeve G et al. End-to-end object detection with transformers[M]. Computer vision-ECCV 2020, 12346, 213-229(2020).

    [12] Vaswani A, Shazeer N, Parmar N et al. Attention is all you need[EB/OL]. https://arxiv.org/abs/1706.03762

    [13] Lin T Y, Dollár P, Girshick R et al. Feature pyramid networks for object detection[C], 936-944(2017).

    [14] Liu Y F, Yan J J, Jia F et al. PETRv2: a unified framework for 3D perception from multi-camera images[C], 3239-3249(2023).

    [15] Chen X S, Shi S S, Zhu B J et al. MPPNet: multi-frame feature intertwining with proxy points for 3D temporal object detection[M]. Computer vision-ECCV 2022, 13668, 680-697(2022).

    [16] Hou J H, Liu Z, Liang D K et al. Query-based temporal fusion with explicit motion for 3D object detection[EB/OL]. https://openreview.net/pdf?id=gySmwdmVDF

    [17] Park J, Xu C F, Yang S J et al. Time will tell: new outlooks and a baseline for temporal multi-view 3D object detection[EB/OL]. https://arxiv.org/abs/2210.02443

    [18] Wang S H, Liu Y F, Wang T C et al. Exploring object-centric temporal modeling for efficient multi-view 3D object detection[C], 3598-3608(2023).

    [19] Caesar H, Bankiti V, Lang A H et al. nuScenes: a multimodal dataset for autonomous driving[C], 11618-11628(2020).

    [20] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C], 770-778(2016).

    [21] Lang A H, Vora S, Caesar H et al. PointPillars: fast encoders for object detection from point clouds[C], 12689-12697(2019).

    [22] Yin T W, Zhou X Y, Krähenbühl P. Center-based 3D object detection and tracking[C], 11779-11788(2021).
