• Acta Photonica Sinica
  • Vol. 53, Issue 6, 0610003 (2024)
Fan YANG1, Zhishe WANG1,*, Jing SUN1, and Zhaofa YU2
Author Affiliations
  • 1School of Applied Science, Taiyuan University of Science and Technology, Taiyuan 030024, China
  • 2Ordnance NCO Academy, Army Engineering University of PLA, Wuhan 430075, China
    DOI: 10.3788/gzxb20245306.0610003
    Fan YANG, Zhishe WANG, Jing SUN, Zhaofa YU. Infrared and Visible Image Fusion Method via Interactive Self-attention[J]. Acta Photonica Sinica, 2024, 53(6): 0610003
    Fig. 1. The framework of the interactive self-attention fusion method
    Fig. 2. The frameworks of Token ViT, Channel ViT and vision transformer
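Fig. 2 contrasts attention computed over spatial tokens (Token ViT) with attention computed over channels (Channel ViT). As a rough, self-contained sketch only (the class names, single-head scaled dot-product formulation, and dimensions below are illustrative assumptions, not the authors' implementation), the two variants differ in whether the affinity matrix is built across the N tokens or across the C channels:

```python
# Illustrative sketch (not the authors' code): token-wise vs channel-wise
# scaled dot-product self-attention over a flattened feature map.
import torch
import torch.nn as nn


class TokenAttention(nn.Module):
    """Attention across spatial tokens: the affinity matrix is (N x N)."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):                     # x: (B, N, C), N spatial tokens
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale        # (B, N, N)
        return self.proj(attn.softmax(dim=-1) @ v)


class ChannelAttention(nn.Module):
    """Attention across channels: the affinity matrix is (C x C)."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                     # x: (B, N, C)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.transpose(-2, -1) for t in (q, k, v))   # (B, C, N)
        attn = (q @ k.transpose(-2, -1)) * (q.shape[-1] ** -0.5)  # (B, C, C)
        return self.proj((attn.softmax(dim=-1) @ v).transpose(-2, -1))


if __name__ == "__main__":
    feat = torch.randn(1, 64 * 64, 32)        # one 64x64 feature map, 32 channels
    print(TokenAttention(32)(feat).shape)     # torch.Size([1, 4096, 32])
    print(ChannelAttention(32)(feat).shape)   # torch.Size([1, 4096, 32])
```

The channel-wise branch keeps its affinity matrix at C×C regardless of image size, which is why such a branch is typically paired with token-wise attention so that both global spatial and cross-channel dependencies are modelled.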
    Fig. 3. The subjective comparison results of different fusion models
    Fig. 4. The subjective comparison of “Nato_camp”, “Kaptein_1123” and “Street” selected from the TNO dataset
    Fig. 5. The objective comparison results of different methods for the TNO dataset
    Fig. 6. The subjective comparison of “00443” and “03989” selected from the M3FD dataset
    Fig. 7. The objective comparison results of different methods for the M3FD dataset
    Fig. 8. The subjective comparison of “FLIR_08910” and “FLIR_06307” selected from the Roadscene dataset
    Fig. 9. The objective comparison results of different methods for the Roadscene dataset
    Methods      AG      MI      PC      FMIp    Qe      Qabf    MS_SSIM  VIF
    w/o Trans    4.3654  2.9292  0.3448  0.9104  0.3616  0.4292  0.8648   0.3996
    w/o CNN      4.7317  2.4740  0.2406  0.8936  0.3243  0.4440  0.9103   0.3900
    w/o Channel  4.9164  2.8528  0.3580  0.9111  0.5078  0.5549  0.9249   0.4309
    w/o Token    5.2964  2.9254  0.3728  0.9099  0.5014  0.5752  0.9126   0.4471
    with PE      5.2211  3.2884  0.3905  0.9086  0.5022  0.6042  0.9044   0.4475
    WP1          5.0357  3.1774  0.3461  0.9015  0.4835  0.5188  0.8836   0.4357
    WP2          5.5843  3.0959  0.3604  0.9038  0.4258  0.5111  0.8830   0.4465
    Ours         5.4921  3.3581  0.3935  0.9105  0.5117  0.6095  0.9119   0.4477
    Table 1. The objective comparison results of different fusion models
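Two of the Table 1 metrics, average gradient (AG) and mutual information (MI), are commonly computed as in the sketch below. This is an illustrative example only (the function names and the 256-bin joint histogram are assumptions), not necessarily the paper's exact implementation:

```python
# Illustrative definitions (common formulations, not necessarily the exact
# ones used in the paper) of two Table 1 metrics: AG and MI.
import numpy as np


def average_gradient(img: np.ndarray) -> float:
    """AG: mean magnitude of horizontal/vertical intensity gradients."""
    img = img.astype(np.float64)
    gx = np.diff(img, axis=1)[:-1, :]   # crop both to a common (H-1, W-1) grid
    gy = np.diff(img, axis=0)[:, :-1]
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))


def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 256) -> float:
    """MI between two images, estimated from their joint gray-level histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz])))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir, vis = rng.integers(0, 256, (2, 128, 128))
    fused = (0.5 * ir + 0.5 * vis).astype(np.uint8)
    print(average_gradient(fused))
    # Fusion MI is usually reported as MI(fused, ir) + MI(fused, vis).
    print(mutual_information(fused, ir) + mutual_information(fused, vis))
```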
    Methods     U2Fusion  RFN-Nest  FusionGAN  GANMcC  YDTR   SwinFusion  SwinFuse  Ours
    TNO         1.515     0.235     0.513      0.785   0.201  2.312       0.223     0.210
    M3FD        4.646     0.864     0.988      1.257   0.771  6.257       0.946     0.833
    Roadscene   0.932     0.170     0.563      1.014   0.087  1.564       0.145     0.096
    Table 2. The comparison results of computational efficiency for different fusion methods (unit: s)
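Per-image runtimes like those in Table 2 are typically obtained by averaging inference time over a set of image pairs after a few warm-up runs. The sketch below shows one such harness; the `fuse` callable, the warm-up count, and the CUDA synchronization step are assumptions about the measurement setup, not details reported in the paper.

```python
# Minimal timing sketch for per-image fusion runtime (Table 2 style).
# `fuse` is a placeholder for any fusion model's inference call.
import time
import torch


@torch.no_grad()
def mean_runtime(fuse, pairs, warmup: int = 3) -> float:
    """Average seconds per infrared/visible pair, excluding warm-up runs."""
    for ir, vis in pairs[:warmup]:           # warm-up: build kernels and caches
        fuse(ir, vis)
    if torch.cuda.is_available():
        torch.cuda.synchronize()              # ensure queued GPU work has finished
    start = time.perf_counter()
    for ir, vis in pairs:
        fuse(ir, vis)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / len(pairs)


if __name__ == "__main__":
    dummy = [(torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256))
             for _ in range(10)]
    naive_fuse = lambda ir, vis: 0.5 * ir + 0.5 * vis   # stand-in for a real model
    print(f"{mean_runtime(naive_fuse, dummy):.4f} s per image pair")
```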