• Infrared and Laser Engineering
  • Vol. 50, Issue 9, 20200467 (2021)
Ying Shen, Chunhong Huang, Feng Huang, Jie Li, Mengjiao Zhu, and Shu Wang*
Author Affiliations
  • College of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350116, China
    DOI: 10.3788/IRLA20200467
    Citation: Ying Shen, Chunhong Huang, Feng Huang, Jie Li, Mengjiao Zhu, Shu Wang. Research progress of infrared and visible image fusion technology[J]. Infrared and Laser Engineering, 2021, 50(9): 20200467
    Fig. 1. Multi-scale transform-based infrared and visible image fusion framework
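    As a concrete illustration of the generic frame in Fig. 1, the sketch below fuses two pre-registered grayscale images with a Laplacian pyramid, an absolute-maximum rule on the detail bands, and averaging on the low-frequency residual. It is a minimal example of the multi-scale pipeline, not the implementation of any specific method surveyed here; the file names and level count are placeholders.

    ```python
    import cv2
    import numpy as np

    def build_laplacian_pyramid(img, levels):
        """Gaussian pyramid -> Laplacian pyramid (coarsest level kept as residual)."""
        gauss = [img.astype(np.float32)]
        for _ in range(levels):
            gauss.append(cv2.pyrDown(gauss[-1]))
        lap = []
        for i in range(levels):
            up = cv2.pyrUp(gauss[i + 1], dstsize=(gauss[i].shape[1], gauss[i].shape[0]))
            lap.append(gauss[i] - up)
        lap.append(gauss[-1])  # low-frequency residual
        return lap

    def fuse_laplacian(ir, vis, levels=4):
        """Decompose both sources, fuse band by band, then reconstruct (Fig. 1)."""
        lap_ir = build_laplacian_pyramid(ir, levels)
        lap_vis = build_laplacian_pyramid(vis, levels)
        fused = []
        for a, b in zip(lap_ir[:-1], lap_vis[:-1]):
            # High-frequency bands: absolute-maximum selection keeps the stronger detail
            fused.append(np.where(np.abs(a) >= np.abs(b), a, b))
        # Low-frequency residual: simple averaging preserves overall brightness
        fused.append(0.5 * (lap_ir[-1] + lap_vis[-1]))
        out = fused[-1]
        for band in reversed(fused[:-1]):
            out = cv2.pyrUp(out, dstsize=(band.shape[1], band.shape[0])) + band
        return np.clip(out, 0, 255).astype(np.uint8)

    # Placeholder file names; both images are assumed registered and the same size
    ir = cv2.imread("ir.png", cv2.IMREAD_GRAYSCALE)
    vis = cv2.imread("vis.png", cv2.IMREAD_GRAYSCALE)
    cv2.imwrite("fused.png", fuse_laplacian(ir, vis))
    ```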
    Fig. 2. Sparse representation-based infrared and visible image fusion framework
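    The sparse-representation frame of Fig. 2 can likewise be sketched with scikit-learn's dictionary learning and OMP coding. The patch size, atom count, sparsity level, and the max-L1 activity rule below are common choices assumed for illustration, not settings taken from the surveyed papers.

    ```python
    import numpy as np
    from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d
    from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder

    PATCH = (8, 8)  # sliding-window patch size (assumed; common in SR fusion)

    def to_patches(img):
        p = extract_patches_2d(img.astype(np.float64) / 255.0, PATCH)
        return p.reshape(len(p), -1)

    def sr_fuse(ir, vis, n_atoms=128, k=5):
        pa, pb = to_patches(ir), to_patches(vis)
        # 1) Learn an over-complete dictionary from both sources
        #    (in practice this is usually done offline; subsample to keep it cheap)
        train = np.vstack([pa, pb])[::8]
        D = MiniBatchDictionaryLearning(n_components=n_atoms,
                                        random_state=0).fit(train).components_
        # 2) Sparse-code each source's patches with OMP over the shared dictionary
        coder = SparseCoder(dictionary=D, transform_algorithm="omp",
                            transform_n_nonzero_coefs=k)
        ca, cb = coder.transform(pa), coder.transform(pb)
        # 3) Fusion rule: keep the coefficient vector with the larger L1 activity level
        take_a = np.abs(ca).sum(axis=1) >= np.abs(cb).sum(axis=1)
        cf = np.where(take_a[:, None], ca, cb)
        # 4) Reconstruct patches from fused codes and re-assemble the image
        pf = (cf @ D).reshape(-1, *PATCH)
        rec = reconstruct_from_patches_2d(pf, ir.shape)
        return np.clip(rec * 255, 0, 255).astype(np.uint8)
    ```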
    Fig. 3. Differences between visible light, infrared, and polarization imaging for face recognition[3]
    Fig. 4. Image fusion processing framework for fruit detection[9]
    Fig. 5. Application of natural color mapping in image fusion[111]
    Fig. 6. Fusion results of nine representative methods on infrared and visible images
    Fig. 7. Objective evaluation metrics for different methods
    | Fusion methods | Specific methods | Fusion strategies | Advantages | Limitations | Applicable scenes |
    |---|---|---|---|---|---|
    | Pyramid transforms | Laplacian pyramid | Fuzzy logic[9] | Smooths image edges; low time consumption; few artifacts | Loss of image details; blocking artifacts; data redundancy | Short-distance scenes with sufficient light, such as equipment detection |
    | | Contrast pyramid | Clonal selection algorithm[17]; teaching-learning-based optimization[88]; multi-objective evolutionary algorithm[89] | High image contrast; abundant characteristic information | Low computing efficiency; loss of image details | |
    | | Steerable pyramid | Absolute value maximum selection (AVMS)[90]; expectation maximization (EM) algorithm[91]; PCNN and weighting[92] | Abundant edge detail; effectively suppresses the Gibbs effect; fuses geometric and thematic features well | Higher algorithmic complexity; loss of image details | |
    | Wavelet transforms | Discrete wavelet transform | Regional energy[93]; target region segmentation[21] | Significant texture information; highly independent scale information; fewer blocking artifacts; higher signal-to-noise ratio | Image aliasing; ringing artifacts; strict registration requirements | Short-distance scenes, such as face recognition |
    | | Dual-tree discrete wavelet transform | Particle swarm optimization[22]; fuzzy logic and population-based optimization[94] | Less redundant information; low time consumption | Limited directional information | |
    | | Lifting wavelet transform | Local regional energy[23]; PCNN[85] | High computing speed; low space complexity | Loss of image details; image distortion | |
    | Nonsubsampled multi-scale and multi-direction geometrical transforms | NSCT | Fuzzy logic[29]; region of interest[30] | Distinct edge features; eliminates the Gibbs effect; better visual perception | Loss of image details; low computing efficiency; poor real-time performance | Scenes with complex backgrounds, such as rescue scenes |
    | | NSST | Region average energy and local directional contrast[33]; FNMF[34] | Superior sparse representation ability; high real-time performance | Loss of luminance information; strict registration requirements; loss of high-frequency image details | Cases requiring real-time processing, such as intelligent traffic monitoring |
    | Sparse representation | | Saliency detection[44, 86-87]; PCNN[56, 95] | Better robustness; fewer artifacts; reduced misregistration; abundant brightness information | Smooths edge texture information; complex computation; loss of edge features in high-frequency images | Scenes with few feature points, such as the sea surface |
    | Neural networks | PCNN | Multi-scale transform and sparse representation; multi-scale transform | Superior adaptability; higher signal-to-noise ratio; high fault tolerance | Model parameters are hard to set; complex, time-consuming algorithms | Automatic target detection and localization |
    | | Deep learning | VGG-19 and multi-layer fusion[69]; VGG-19 and saliency detection[70] | Less artificial noise; abundant characteristic information; fewer artifacts | Requires ground truth in advance | |
    | | | GAN[71] | Avoids manually designing complicated activity-level measurements and fusion rules | Visual information fidelity and correlation coefficient are not optimal | |
    | Hybrid methods | Multi-scale transform and saliency | Weight calculation[76-80]; salient object extraction[81, 82] | Maintains the integrity of salient object regions; improves the visual quality of the fused image; reduces noise | Inconsistent highlighting of salient areas; loss of background information | Surveillance applications, such as object detection and tracking |
    | | Multi-scale transform and SR | Absolute value of coefficients and SR[38]; fourth-order correlation coefficient matching and SR[83] | Retains luminance information; excellent stability and robustness | Poor real-time performance; loss of image details | |
    Table 1. Comparison of infrared and visible image fusion methods
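    To make the wavelet rows of Table 1 concrete, here is a brief sketch of DWT-based fusion with PyWavelets. The regional-energy strategies cited in the table are simplified here to a per-coefficient absolute-maximum rule, so this is illustrative only; the wavelet name and level count are assumptions.

    ```python
    import numpy as np
    import pywt

    def dwt_fuse(ir, vis, wavelet="db2", levels=3):
        """DWT fusion: max rule on detail sub-bands, averaging on the approximation."""
        ca = pywt.wavedec2(ir.astype(np.float32), wavelet, level=levels)
        cb = pywt.wavedec2(vis.astype(np.float32), wavelet, level=levels)
        # Approximation band: average to preserve overall brightness
        fused = [0.5 * (ca[0] + cb[0])]
        for (aH, aV, aD), (bH, bV, bD) in zip(ca[1:], cb[1:]):
            # Detail sub-bands: absolute-maximum selection as a simple activity measure
            fused.append(tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                               for x, y in ((aH, bH), (aV, bV), (aD, bD))))
        return np.clip(pywt.waverec2(fused, wavelet), 0, 255).astype(np.uint8)
    ```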
    | Evaluation indicators | Definition | Explanation |
    |---|---|---|
    | IE[124] | $IE = -\sum_{i=0}^{L-1} p_i \log_2 p_i$ | The amount of information contained in an image increases as IE increases |
    | SD[125] | $SD = \sqrt{\dfrac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(F(i,j)-\mu\right)^2}$ | SD measures the deviation of pixel values from the mean; a larger SD indicates higher image contrast |
    | AG[126] | $AG = \dfrac{1}{(M-1)(N-1)}\sum_{i=1}^{M-1}\sum_{j=1}^{N-1}\sqrt{\dfrac{\Delta Z_i^2+\Delta Z_j^2}{2}}$ | AG reflects the gray-level variation of the image; a high AG indicates rich detail information |
    | $Q^{AB/F}$[127] | $Q^{AB/F} = \dfrac{\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left(Q^{AF}(i,j)\,w^{A}(i,j)+Q^{BF}(i,j)\,w^{B}(i,j)\right)}{\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left(w^{A}(i,j)+w^{B}(i,j)\right)}$ | $Q^{AB/F}$ evaluates the transfer of edge information from the source images to the fused image; the closer it is to 1, the better the fusion |
    | MI[2] | $I_{FA}(i,j)=\sum_{i=1}^{M-1}\sum_{j=1}^{N-1}P_{FA}(i,j)\log_2\dfrac{P_{FA}(i,j)}{P_F(i)P_A(j)}$, $\;MI_{AB}^{F}=I_{FA}+I_{FB}$ | MI characterizes how much source-image information is inherited by the fused image; more information is preserved as MI increases |
    | CC[128] | $CC=\dfrac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left[\left(F(i,j)-\mu_F\right)\left(S(i,j)-\mu_S\right)\right]}{\sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N}\left(F(i,j)-\mu_F\right)^2\,\sum_{i=1}^{M}\sum_{j=1}^{N}\left(S(i,j)-\mu_S\right)^2}}$ | A higher CC indicates greater similarity between the fused image and the source image, and thus better information preservation |
    Table 2. Evaluation indices without a reference image
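    The no-reference indices in Table 2 translate directly into NumPy. A minimal sketch for IE, SD, and AG follows, assuming an 8-bit grayscale fused image; the function names are ours.

    ```python
    import numpy as np

    def entropy(img, bins=256):
        """IE: Shannon entropy of the gray-level histogram (Table 2)."""
        p = np.bincount(img.ravel(), minlength=bins) / img.size
        p = p[p > 0]  # drop empty bins so log2 is defined
        return float(-(p * np.log2(p)).sum())

    def std_dev(img):
        """SD: root-mean-square deviation of pixels from the image mean."""
        f = img.astype(np.float64)
        return float(np.sqrt(np.mean((f - f.mean()) ** 2)))

    def avg_gradient(img):
        """AG: mean of sqrt((dZ_i^2 + dZ_j^2)/2) over forward differences."""
        f = img.astype(np.float64)
        dzi = f[1:, :-1] - f[:-1, :-1]   # vertical forward difference
        dzj = f[:-1, 1:] - f[:-1, :-1]   # horizontal forward difference
        return float(np.mean(np.sqrt((dzi ** 2 + dzj ** 2) / 2)))
    ```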
    | Evaluation indicators | Definition | Explanation |
    |---|---|---|
    | SSIM[129] | $SSIM_{RF} = \dfrac{2\mu_R\mu_F+c_1}{\mu_R^2+\mu_F^2+c_1}\cdot\dfrac{2\sigma_R\sigma_F+c_2}{\sigma_R^2+\sigma_F^2+c_2}\cdot\dfrac{\sigma_{RF}+c_3}{\sigma_R\sigma_F+c_3}$ | SSIM measures luminance, contrast, and structural distortion; a higher SSIM indicates greater similarity between the source image and the fused image |
    | RMSE[2] | $RMSE = \sqrt{\dfrac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[R(i,j)-F(i,j)\right]^2}$ | A lower RMSE indicates a smaller deviation between the fused image and the reference image |
    | PSNR[2] | $PSNR = 10\cdot\lg\dfrac{255^2\times M\times N}{\sum_{i=1}^{M}\sum_{j=1}^{N}\left[R(i,j)-F(i,j)\right]^2}$ | PSNR evaluates how well image noise is suppressed; distortion decreases as PSNR increases |
    Table 3. Evaluation indices based on a reference image
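    The reference-based indices of Table 3 are equally direct. The sketch below implements RMSE and PSNR from their definitions, assuming an 8-bit reference image R and fused image F of the same size.

    ```python
    import numpy as np

    def rmse(ref, fused):
        """RMSE per Table 3: lower means less deviation from the reference."""
        d = ref.astype(np.float64) - fused.astype(np.float64)
        return float(np.sqrt(np.mean(d ** 2)))

    def psnr(ref, fused, peak=255.0):
        """PSNR per Table 3: 10*lg(peak^2 / MSE); higher means less distortion."""
        d = ref.astype(np.float64) - fused.astype(np.float64)
        return float(10 * np.log10(peak ** 2 / np.mean(d ** 2)))

    # SSIM is most easily taken from an existing implementation, e.g.
    # skimage.metrics.structural_similarity(ref, fused), rather than re-derived here.
    ```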