• Acta Photonica Sinica
  • Vol. 51, Issue 2, 0210002 (2022)
Penghui ZHANG1, Zhi LIU2, Jianyong ZHENG3, Boxia HE1,*, and Yuhao PEI1
Author Affiliations
  • 1School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • 2Nanjing Planetech Intelligent Technology Co., LTD, Nanjing 210014, China
  • 3Institute of Artificial Intelligence, Shanghai University, Shanghai 200444, China
    DOI: 10.3788/gzxb20225102.0210002
    Penghui ZHANG, Zhi LIU, Jianyong ZHENG, Boxia HE, Yuhao PEI. Real-time Infrared Target Detection Algorithm for Embedded System in Complex Scene[J]. Acta Photonica Sinica, 2022, 51(2): 0210002

    Abstract

    To address the low accuracy and recall of infrared target detection against complex backgrounds, and the slow inference of network models on embedded computing platforms, two real-time infrared target detection networks for embedded systems are proposed. Both take the lightweight network YOLOv4-Tiny as the base architecture and combine it with a visual attention mechanism and spatial pyramid pooling. Because complex infrared scenes contain a large amount of background interference, the visual attention mechanism is used to learn the weight distribution of the feature map and recalibrate it, which strengthens the focus on the target, reduces the influence of irrelevant background information, and improves the detection and recognition ability of the model. Spatial pyramid pooling fuses multi-scale features, enriching the information in the feature maps and improving the recognition and localization of infrared targets at different scales. Grad-CAM is used to visualize the feature maps strengthened by the attention mechanism, showing the attention the network model pays to the target region. The networks are trained on a computer platform with a 2080Ti GPU using a transfer learning strategy and deployed on the Atlas 200 DK embedded computing platform, which is built around the Ascend 310 AI chip. Experimental results on infrared images with a resolution of 640 pixels × 512 pixels show that, on the computer platform and compared with the original YOLOv4-Tiny network, the proposed YOLOv4-Tiny+SE+SPP network improves average accuracy and recall by 13.96% and 20.14%, respectively, with an inference speed of 212 FPS, while the proposed YOLOv4-Tiny+CBAM+SPP network improves them by 15.75% and 22.41%, respectively, with an inference speed of 202 FPS. On the Atlas 200 DK embedded computing platform, again compared with the original YOLOv4-Tiny network on images of the same resolution, YOLOv4-Tiny+SE+SPP improves average accuracy and recall by 12.36% and 18.6%, respectively, at 78 FPS, and YOLOv4-Tiny+CBAM+SPP improves them by 15.94% and 22.89%, respectively, at 71 FPS, which can meet the needs of real-time detection and tracking of infrared targets in the military and security fields.
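The two ideas the abstract combines, channel recalibration by an SE-style attention block and multi-scale fusion by spatial pyramid pooling, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no layer sizes, so the weight matrices `w1` and `w2`, the bias-free layers, the tiny 2 × 2 feature map, and the pooling kernel sizes 5, 9, and 13 are all assumptions chosen for illustration.

```python
import math

def squeeze_excite(fmap, w1, w2):
    """SE-style channel recalibration of a C x H x W nested-list feature map.

    w1 (hidden x C) and w2 (C x hidden) are illustrative, bias-free weights
    (assumed here; a trained network would learn them).
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Recalibrate: multiply every value in a channel by that channel's weight.
    out = [[[s * v for v in row] for row in ch] for ch, s in zip(fmap, scale)]
    return out, scale

def max_pool_same(ch, k):
    """Stride-1 max pooling with a k x k window clipped at the borders,
    so the output keeps the input's height and width."""
    h, w = len(ch), len(ch[0])
    r = k // 2
    return [[max(ch[y][x]
                 for y in range(max(0, i - r), min(h, i + r + 1))
                 for x in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)] for i in range(h)]

def spp(fmap, kernels=(5, 9, 13)):
    """Concatenate the input with its pooled copies along the channel axis.
    The kernel sizes are an assumption, not taken from the abstract."""
    out = list(fmap)
    for k in kernels:
        out.extend(max_pool_same(ch, k) for ch in fmap)
    return out

# Hypothetical 2-channel, 2 x 2 feature map and hand-picked weights.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],   # channel 0
    [[0.0, 0.0], [0.0, 8.0]],   # channel 1
]
w1 = [[0.5, 0.5]]        # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channel weights
recal, scale = squeeze_excite(fmap, w1, w2)
fused = spp(recal)       # 2 input channels + 3 kernels x 2 pooled channels
```

With these hand-picked weights, channel 0 receives a larger scale factor than channel 1, which is the sense in which the attention block emphasizes some channels and suppresses others. The pooled copies keep the spatial size of the input, so they can be stacked directly with it.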