• Spectroscopy and Spectral Analysis
  • Vol. 43, Issue 1, 289 (2023)
ZHU Wen-qing1,2,3,*, ZHANG Ning1,2,3, LI Zheng1,2,3, LIU Peng1,3, and TANG Xin-yi1,3
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]
    DOI: 10.3964/j.issn.1000-0593(2023)01-0289-08
    ZHU Wen-qing, ZHANG Ning, LI Zheng, LIU Peng, TANG Xin-yi. A Multi-Task Convolutional Neural Network for Infrared and Visible Multi-Resolution Image Fusion[J]. Spectroscopy and Spectral Analysis, 2023, 43(1): 289

    Abstract

    Infrared and visible image fusion has long been a research hotspot in image processing. Fusion can compensate for the deficiencies of a single sensor and provide a good imaging foundation for image understanding and analysis. Because of limits on production technology and cost, the resolution of infrared detectors is much lower than that of visible detectors, which greatly hinders practical use. This paper proposes a multi-task convolutional neural network framework that combines infrared super-resolution with image fusion and applies it to infrared and visible multi-resolution image fusion. In terms of network structure, a dual-channel network is first designed to extract infrared and visible features separately, so that the proposed algorithm is not limited by the resolution of either source image. Second, a feature up-sampling block is proposed: bilinear interpolation increases the number of pixels, and a multilayer perceptron then refines the mapping between the smooth pixel space and the high-frequency space. As a result, infrared images can be rendered at arbitrary scales, including scales not covered during training. Furthermore, a linear self-attention mechanism is introduced into the network to learn the nonlinear relationships between positions in feature space, suppress irrelevant information, and strengthen the expression of global information. In terms of the loss function, a gradient loss is proposed that retains, from the infrared and visible images, the filter response with the larger absolute value and computes the Frobenius norm between this value and the filter response of the reconstructed fused image. Fused images can therefore be generated without ideal images serving as ground truth to supervise network learning. Finally, optimizing the multi-task model under the combined action of the gradient loss and a pixel loss reconstructs the fused image and the high-resolution infrared image simultaneously.
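The gradient loss described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not name the filter, so a 3x3 Laplacian kernel is assumed here, and the names `LAPLACIAN`, `filter_response`, and `gradient_loss` are hypothetical. The infrared image is assumed to have already been brought to the same size as the visible image.

```python
import numpy as np

# Assumption: a 3x3 Laplacian stands in for the unspecified filter.
LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float64)


def filter_response(img, kernel=LAPLACIAN):
    """Same-size 2-D correlation with edge-replicated padding."""
    kh, kw = kernel.shape
    padded = np.pad(img.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)),
                    mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0],
                                         j:j + img.shape[1]]
    return out


def gradient_loss(infrared, visible, fused):
    """Frobenius norm between the max-magnitude source response
    and the response of the fused image."""
    r_ir = filter_response(infrared)
    r_vis = filter_response(visible)
    # Per pixel, keep whichever source response has the larger magnitude.
    target = np.where(np.abs(r_ir) >= np.abs(r_vis), r_ir, r_vis)
    return float(np.linalg.norm(target - filter_response(fused), ord="fro"))
```

Because the target is built from the two source images themselves, no ground-truth fused image is needed. In this sketch, a flat infrared image paired with a textured visible image gives zero loss when the fused image equals the visible one.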
The proposed approach is trained on the RoadScene dataset and compared with four other related algorithms on the TNO dataset. Subjectively, the proposed method accepts source images of arbitrary resolution, and the fused images show prominent infrared targets and rich visible details. Even when the resolutions of the source images differ greatly, the method still reconstructs high-resolution infrared images with clear features and generalizes robustly. Objectively, it achieves excellent results on multiple evaluation metrics, including entropy, the sum of the correlations of differences, and spatial frequency. The experimental results show that the fused images carry a large amount of information and have a high information conversion rate and high clarity, which verifies the effectiveness of the proposed method.
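Two of the metrics named above, entropy and spatial frequency, have widely used textbook definitions, sketched below in NumPy. The exact variants used in the paper are not given in the abstract, so this is an assumption: entropy is taken over the gray-level histogram of an 8-bit image, and spatial frequency is the root of the summed squared row and column differences normalized by the pixel count. The function names are hypothetical, and the sum of the correlations of differences is omitted.

```python
import numpy as np


def entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram.

    Assumes `img` is an unsigned-integer array with values < levels.
    """
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-np.sum(p * np.log2(p)))


def spatial_frequency(img):
    """sqrt(RF^2 + CF^2) from horizontal and vertical differences."""
    f = img.astype(np.float64)
    rf2 = np.sum((f[:, 1:] - f[:, :-1]) ** 2) / f.size  # row frequency squared
    cf2 = np.sum((f[1:, :] - f[:-1, :]) ** 2) / f.size  # column frequency squared
    return float(np.sqrt(rf2 + cf2))
```

Higher entropy indicates more information in the fused image, and higher spatial frequency indicates sharper detail, which is how these metrics relate to the "amount of information" and "clarity" claims.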