• High Power Laser and Particle Beams
  • Vol. 37, Issue 5, 051001 (2025)
Tianlong Zhang, Yuanchao Geng, Yuzhen Liao, and Dangpeng Xu*
Author Affiliations
  • Laser Fusion Research Center, CAEP, Mianyang 621900, China
    DOI: 10.11884/HPLPB202537.240370
    Tianlong Zhang, Yuanchao Geng, Yuzhen Liao, Dangpeng Xu. A review of multispectral target detection algorithms and related datasets[J]. High Power Laser and Particle Beams, 2025, 37(5): 051001
    References

    [1] Voulodimos A, Doulamis N, Doulamis A et al. Deep learning for computer vision: a brief review[J]. Computational Intelligence and Neuroscience, 2018, 7068349(2018).

    [2] Li Ke, Wan Gang, Cheng Gong et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 159, 296-307(2020).

    [3] Himeur Y, Rimal B, Tiwary A et al. Using artificial intelligence and data fusion for environmental monitoring: a review and future perspectives[J]. Information Fusion, 86/87, 44-75(2022).

    [4] Janakiramaiah B, Kalyani G, Karuna A et al. Retracted article: military object detection in defense using multi-level capsule networks[J]. Soft Computing, 27, 1045-1059(2023).

    [5] Ren Shaoqing, He Kaiming, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149(2017).

    [6] Feng Di, Haase-Schütz C, Rosenbaum L et al. Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and challenges[J]. IEEE Transactions on Intelligent Transportation Systems, 22, 1341-1360(2021).

    [7] Sun Wei, Dai Liang, Zhang Xiaorui et al. RSOD: real-time small object detection algorithm in UAV-based traffic monitoring[J]. Applied Intelligence, 52, 8448-8463(2022).

    [8] Ghasemi Y, Jeong H, Choi S H et al. Deep learning-based object detection in augmented reality: a systematic review[J]. Computers in Industry, 139, 103661(2022).

    [9] Li Yongjun, Li Shasha, Du Haohao et al. YOLO-ACN: focusing on small target and occluded object detection[J]. IEEE Access, 8, 227288-227303(2020).

    [10] Li Haoyuan, Hu Qi, Yao You, et al. CFMW: cross-modality fusion mamba for multispectral object detection under adverse weather conditions[DB/OL]. arXiv preprint arXiv: 2404.16302, 2024.

    [11] Guan Dayan, Cao Yanpeng, Yang Jiangxin et al. Fusion of multispectral data through illumination-aware deep neural networks for pedestrian detection[J]. Information Fusion, 50, 148-157(2019).

    [12] Li Chengyang, Song Dan, Tong Ruofeng et al. Illumination-aware faster R-CNN for robust multispectral pedestrian detection[J]. Pattern Recognition, 85, 161-171(2019).

    [13] Zhou Kailai, Chen Linsen, Cao Xun. Improving multispectral pedestrian detection by addressing modality imbalance problems[C]//Proceedings of the 16th European Conference on Computer Vision. 2020: 787-803.

    [14] Liu Ye, Meng Shiyang, Wang Hongzhang et al. Deep learning based object detection from multi-modal sensors: an overview[J]. Multimedia Tools and Applications, 83, 19841-19870(2024).

    [15] FLIR Thermal Dataset[DB/OL]. [2023]. https://www.flir.com/oem/adas/adas-dataset-form/.

    [16] Jia Xinyu, Zhu Chuang, Li Minzhen, et al. LLVIP: a visible-infrared paired dataset for low-light vision[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision Workshops. 2021: 3489-3497.

    [17] Paolo Gamba. Pavia Centre[DB/OL]. [2010]. http://tlclab.unipv.it.

    [18] Purdue University's MultiSpec site[DB/OL]. [1992]. https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html.

    [19] Zhang Jiaqing, Lei Jie, Xie Weiying et al. SuperYOLO: super resolution assisted object detection in multimodal remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 61, 5605415(2023).

    [20] Razakarivony S, Jurie F. Vehicle detection in aerial imagery: a small target detection benchmark[J]. Journal of Visual Communication and Image Representation, 34, 187-203(2016).

    [21] Hwang S, Park J, Kim N, et al. Multispectral pedestrian detection: benchmark dataset and baseline[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1037-1045.

    [22] Liu Jingjing, Zhang Shaoting, Wang Shu, et al. Multispectral deep neural networks for pedestrian detection[C]//Proceedings of British Machine Vision Conference 2016. 2016.

    [23] Li Chengyang, Song Dan, Tong Ruofeng, et al. Multispectral pedestrian detection via simultaneous detection and segmentation[C]//Proceedings of British Machine Vision Conference 2018. 2018.

    [24] Xu Lizhi. Research on imaging quality for airborne sweeping hyperspectral imager[D]. Changchun: University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences), 2020: 2

    [25] Yu Lei. Development and application of imaging spectrometer (Invited)[J]. Infrared and Laser Engineering, 51, 20210940(2022).

    [26] Li Yue, Yang Cankun, Zhou Chunping. Advance and application of UAV hyperspectral imaging equipment[J]. Bulletin of Surveying and Mapping, 1-6,17(2019).

    [27] Salinas[DB/OL]. [2001]. https://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes.

    [28] Green R O, Eastwood M L, Sarture C M et al. Imaging spectroscopy and the airborne visible/infrared imaging spectrometer (AVIRIS)[J]. Remote Sensing of Environment, 65, 227-248(1998).

    [29] Helber P, Bischke B, Dengel A et al. EuroSAT: a novel dataset and deep learning benchmark for land use and land cover classification[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12, 2217-2226(2019).

    [30] Martimort P, Fernandez V, Kirschner V, et al. Sentinel-2 MultiSpectral imager (MSI) and calibration/validation[C]//Proceedings of 2012 IEEE International Geoscience and Remote Sensing Symposium. 2012: 6999-7002.

    [31] Chen Yupeng. Theory, design and experiment of snapshot infrared Fourier transform imaging spectrometer[D]. Changchun: University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences), 2022: 3

    [32] Lin C H, Huang S H, Lin T H et al. Metasurface-empowered snapshot hyperspectral imaging with convex/deep (CODE) small-data learning theory[J]. Nature Communications, 14, 6979(2023).

    [33] Miao Xin, Yuan Xin, Pu Yunchen, et al. Lambda-net: reconstruct hyperspectral images from a snapshot measurement[C]//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. 2019: 4058-4068.

    [34] Yorimoto K, Han Xianhua. HyperMixNet: hyperspectral image reconstruction with deep mixed network from a snapshot measurement[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. 2021: 1184-1193.

    [35] Cao Xun, Yue Tao, Lin Xing et al. Computational snapshot multispectral cameras: toward dynamic capture of the spectral world[J]. IEEE Signal Processing Magazine, 33, 95-108(2016).

    [36] Lucieer A, Malenovský Z, Veness T et al. HyperUAS—imaging spectroscopy from a multirotor unmanned aircraft system[J]. Journal of Field Robotics, 31, 571-590(2014).

    [37] Hruska R, Mitchell J, Anderson M et al. Radiometric and geometric analysis of hyperspectral imagery acquired from an unmanned aerial vehicle[J]. Remote Sensing, 4, 2736-2752(2012).

    [38] Bernath P F. Spectra of atoms and molecules[M]. 4th ed. Oxford: Oxford University Press, 2020.

    [39] Adão T, Hruška J, Pádua L et al. Hyperspectral imaging: a review on UAV-based sensors, data processing and applications for agriculture and forestry[J]. Remote Sensing, 9, 1110(2017).

    [40] Yadav A K, Roy R, Kumar R, et al. Algorithm for denoising of color images based on median filter[C]//Proceedings of the 2015 3rd International Conference on Image Information Processing. 2015: 428-432.

    [41] Peng Honghong, Rao R, Dianat S A. Multispectral image denoising with optimized vector bilateral filter[J]. IEEE Transactions on Image Processing, 23, 264-273(2014).

    [42] Ojha U, Garg A. Denoising high resolution multispectral images using deep learning approach[C]//Proceedings of the 2016 15th IEEE International Conference on Machine Learning and Applications. 2016: 871-875.

    [43] Dai Xiaoai, He Xuwei, Guo Shouheng et al. Research on hyper-spectral remote sensing image classification by applying stacked de-noising auto-encoders neural network[J]. Multimedia Tools and Applications, 80, 21219-21239(2021).

    [44] Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the 13th European Conference on Computer Vision. 2014: 740-755.

    [45] Andriluka M, Pishchulin L, Gehler P, et al. 2D human pose estimation: new benchmark and state of the art analysis[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. 2014: 3686-3693.

    [46] Everingham M, Eslami S M A, Van Gool L et al. The PASCAL visual object classes challenge: a retrospective[J]. International Journal of Computer Vision, 111, 98-136(2015).

    [47] Chen Xinlei, Fang Hao, Lin T Y, et al. Microsoft COCO captions: data collection and evaluation server[DB/OL]. arXiv preprint arXiv: 1504.00325, 2015.

    [48] Wu Fan. AutoLabelImg[EB/OL]. [2020]. https://github.com/wufan-tb/AutoLabelImg.

    [49] Zhou Xingyi, Koltun V, Krähenbühl P. Tracking objects as points[C]//Proceedings of the 16th European Conference on Computer Vision. 2020: 474-490.

    [50] Liu Li, Ouyang Wanli, Wang Xiaogang et al. Deep learning for generic object detection: a survey[J]. International Journal of Computer Vision, 128, 261-318(2020).

    [51] Li Ruihuang, He Chenhang, Zhang Yabin, et al. SIM: semantic-aware instance mask generation for box-supervised instance segmentation[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 7193-7203.

    [52] Chen Li, Li Linhan, Wang Shiyong. MMShip: medium resolution multispectral satellite imagery ship dataset[J]. Optics and Precision Engineering, 31, 1962-1972(2023).

    [53] Chen Zizhao, Qian Yeqiang, Yang Xiaoxiao, et al. AMFD: distillation via adaptive multimodal fusion for multispectral pedestrian detection[DB/OL]. arXiv preprint arXiv: 2405.12944, 2024.

    [54] Salomonson V V, Barnes W L, Maymon P W et al. MODIS: advanced facility instrument for studies of the Earth as a system[J]. IEEE Transactions on Geoscience and Remote Sensing, 27, 145-153(1989).

    [55] Yan Yunbin, Cui Bolun, Yang Tingting. Multi-modal high-resolution hyperspectral object detection system based on lightweight platform[J]. Infrared Technology, 45, 582-591(2023).

    [56] Jia Jianxin, Wang Yueming, Zheng Xiaorou, et al. Design, performance, and applications of AMMIS: a novel airborne multi-modular imaging spectrometer for high-resolution earth observations[J]. Engineering, doi: 10.1016/j.eng.2024.11.001.

    [57] Wan Yuanqing, Liu Weijun, Lin Ruoyu. Research progress and applications of spectral imaging based on metasurfaces[J]. Opto-Electronic Engineering, 50, 230139(2023).

    [58] Xue Qingsheng, Bai Haoxuan, Lu Fengqin. Development of snapshot hyperspectral imager based on microlens array[J]. Acta Photonica Sinica, 52, 0552223(2023).

    [59] Wang Juntong, Yang Huadong. Camouflaged target recognition technology based on hyperspectral unmixing[J]. Semiconductor Optoelectronics, 45, 261-268(2024).

    [60] Zou Zhengxia, Chen Keyan, Shi Zhenwei et al. Object detection in 20 years: a survey[J]. Proceedings of the IEEE, 111, 257-276(2023).

    [61] Lowe D G. Object recognition from local scale-invariant features[C]//Proceedings of the 7th IEEE International Conference on Computer Vision. 1999: 1150-1157.

    [62] Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 60, 91-110(2004).

    [63] Bay H, Tuytelaars T, Van Gool L. SURF: speeded up robust features[C]//Proceedings of the 9th European Conference on Computer Vision. 2006: 404-417.

    [64] Song Yanyan, Lu Ying. Decision tree methods: applications for classification and prediction[J]. Shanghai Archives of Psychiatry, 27, 130-135(2015).

    [65] Jijo B T, Abdulazeez A M. Classification based on decision tree algorithm for machine learning[J]. Journal of Applied Science and Technology Trends, 2, 20-28(2021).

    [66] Abdullah D M, Abdulazeez A M. Machine learning applications based on SVM classification: a review[J]. Qubahan Academic Journal, 1, 81-90(2021).

    [67] Viola P, Jones M. Rapid object detection using a boosted cascade of simple features[C]//Proceedings of 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2001: I.

    [68] Viola P, Jones M J. Robust real-time face detection[J]. International Journal of Computer Vision, 57, 137-154(2004).

    [69] Papageorgiou C P, Oren M, Poggio T. A general framework for object detection[C]//Proceedings of the Sixth International Conference on Computer Vision. 1998: 555-562.

    [70] Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2005: 886-893.

    [71] Felzenszwalb P, McAllester D, Ramanan D. A discriminatively trained, multiscale, deformable part model[C]//Proceedings of 2008 IEEE Conference on Computer Vision and Pattern Recognition. 2008: 1-8.

    [72] Zhang Tianwen, Zhang Xiaoling, Ke Xiao et al. HOG-ShipCLSNet: a novel deep learning network with hog feature fusion for SAR ship classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 60, 5210322(2022).

    [73] Tang Zetian, Zhang Zemin, Chen Wei et al. An SIFT-based fast image alignment algorithm for high-resolution image[J]. IEEE Access, 11, 42012-42041(2023).

    [74] Paszke A, Gross S, Massa F, et al. PyTorch: an imperative style, high-performance deep learning library[C]//Proceedings of the 33rd Conference on Neural Information Processing Systems. 2019: 32.

    [75] Abadi M, Agarwal A, Barham P, et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems[DB/OL]. arXiv preprint arXiv: 1603.04467, 2016.

    [76] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. 2012: 1097-1105.

    [77] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. 2014: 580-587.

    [78] Uijlings J R R, Van De Sande K E A, Gevers T et al. Selective search for object recognition[J]. International Journal of Computer Vision, 104, 154-171(2013).

    [79] He Kaiming, Zhang Xiangyu, Ren Shaoqing et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 1904-1916(2015).

    [80] Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. 2015: 1440-1448.

    [81] Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. 2017: 936-944.

    [82] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.

    [83] Liu Wei, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision. 2016: 21-37.

    [84] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. 2017: 2999-3007.

    [85] Law H, Deng Jia. CornerNet: detecting objects as paired keypoints[C]//Proceedings of the 15th European Conference on Computer Vision. 2018: 765-781.

    [86] Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers[C]//Proceedings of the 16th European Conference on Computer Vision. 2020: 213-229.

    [87] Zhu Xizhou, Su Weijie, Lu Lewei, et al. Deformable DETR: deformable transformers for end-to-end object detection[C]//Proceedings of the 9th International Conference on Learning Representations. 2021.

    [88] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6000-6010.

    [89] Han Dongchen, Pan Xuran, Han Yizeng, et al. Flatten transformer: vision transformer using focused linear attention[C]//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. 2023: 5938-5948.

    [90] Yao Ting, Li Yehao, Pan Yingwei et al. Dual vision transformer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 10870-10882(2023).

    [91] Gu A, Dao T. Mamba: linear-time sequence modeling with selective state spaces[DB/OL]. arXiv preprint arXiv: 2312.00752, 2024.

    [92] Zhu Lianghui, Liao Bencheng, Zhang Qian, et al. Vision mamba: efficient visual representation learning with bidirectional state space model[C]//Proceedings of the 41st International Conference on Machine Learning. 2024.

    [93] He Kaiming, Zhang Xiangyu, Ren Shaoqing, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.

    [94] González A, Fang Zhijie, Socarras Y et al. Pedestrian detection at day/night time with visible and FIR cameras: a comparison[J]. Sensors, 16, 820(2016).

    [95] Fang Qingyun, Han Dapeng, Wang Zhaokui. Cross-modality fusion transformer for multispectral object detection[DB/OL]. arXiv preprint arXiv: 2111.00273, 2022.

    [96] Redmon J, Farhadi A. YOLOv3: an incremental improvement[DB/OL]. arXiv preprint arXiv: 1804.02767, 2018.

    [97] Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020: 574.

    [98] Rombach R, Blattmann A, Lenz D, et al. High-resolution image synthesis with latent diffusion models[C]//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 10674-10685.

    [99] Zhao Tianyi, Yuan Maoxun, Jiang Feng, et al. Removal and selection: improving RGB-infrared object detection via coarse-to-fine fusion[DB/OL]. arXiv preprint arXiv: 2401.10731, 2024.

    [100] Jacobs R A, Jordan M I, Nowlan S J et al. Adaptive mixtures of local experts[J]. Neural Computation, 3, 79-87(1991).

    [101] Shazeer N, Mirhoseini A, Maziarz K, et al. Outrageously large neural networks: the sparsely-gated mixture-of-experts layer[C]//Proceedings of the 5th International Conference on Learning Representations. 2017.

    [102] Sohl-Dickstein J, Weiss E A, Maheswaranathan N, et al. Deep unsupervised learning using nonequilibrium thermodynamics[C]//Proceedings of the 32nd International Conference on Machine Learning. 2015: 2256-2265.

    [103] Ono S. Snapshot multispectral imaging using a pixel-wise polarization color image sensor[J]. Optics Express, 28, 34536-34573(2020).

    [104] Hubold M, Montag E, Berlich R et al. Multi-aperture system approach for snapshot multispectral imaging applications[J]. Optics Express, 29, 7361-7378(2021).

    [105] Mengu D, Tabassum A, Jarrahi M, et al. Snapshot multispectral imaging using a diffractive optical network[J]. Light: Science & Applications, 2023, 12: 86.

    [106] Wang Xudong, Girdhar R, Yu S X, et al. Cut and learn for unsupervised object detection and instance segmentation[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 3124-3134.
