Remote Sensing Image Fusion Method Based on Improved Swin Transformer

Zitong LI; Jiankang ZHAO; Jingran XU; Haihui LONG; Chuanqi LIU

doi:10.3788/gzxb20235211.1110001

Acta Photonica Sinica
Vol. 52, Issue 11, 1110001 (2023)

Zitong LI, Jiankang ZHAO^*, Jingran XU, Haihui LONG, and Chuanqi LIU

Author Affiliations

School of Electronic Information and Electrical Engineering, School of Perceptual Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Zitong LI, Jiankang ZHAO, Jingran XU, Haihui LONG, Chuanqi LIU. Remote Sensing Image Fusion Method Based on Improved Swin Transformer[J]. Acta Photonica Sinica, 2023, 52(11): 1110001

Fig. 1. Overall network structure

Fig. 2. Detail injection model

Fig. 3. Multi-scale CNN and channel attention module

Fig. 4. Structure of feature reconstruction network

Fig. 5. Fusion result of WorldView-4 simulation dataset

Fig. 6. Residual graph of WorldView-4 simulation dataset

Fig. 7. Fusion result of QuickBird simulation dataset

Fig. 8. Residual graph of QuickBird simulation dataset

Fig. 9. Fusion result of WorldView-2 simulation dataset

Fig. 10. Residual graph of WorldView-2 simulation dataset

Fig. 11. Fusion result of WorldView-4 real dataset

Fig. 12. Three different window attention unit structures

for i in epochs			i-th epoch; the maximum number of epochs is set to 200
	for j in batches		j-th batch
		Select 32 patches of PAN images;	Select 32 images from the PAN dataset;
		Select 32 patches of LRMS images;	Select 32 images from the LRMS dataset;
		Select 32 patches of HRMS images;	Select 32 images from the HRMS dataset;
		Produce the output $\hat{P} = f(PAN, LRMS)$;	Compute the fused image generated by the model;
		Calculate the loss $L$;	Compute the loss function $L$ between the fused image and the reference image;
		Update parameters with the Adam optimizer;	Update the model parameters according to $L$ using the Adam optimizer;
	end
end

Table 0. Training procedure of the network (original caption in Chinese)
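
The loop in Table 0 can be sketched in NumPy as follows. This is only an illustration of the control flow (epochs, batches of 32 patches, forward pass, loss, Adam update) and not the paper's network: the model `forward` is a hypothetical one-gain-per-band detail-injection toy, the loss is plain MAE, and the names `upsample`, `forward`, `train`, the learning rate, and the Adam constants are all assumptions introduced here.

```python
import numpy as np

def upsample(lrms, scale=4):
    # Nearest-neighbour upsampling of an (N, h, w, C) batch; a simple stand-in
    # for whatever interpolation a real network would use.
    return lrms.repeat(scale, axis=1).repeat(scale, axis=2)

def forward(gain, pan, lrms):
    # Hypothetical toy model f(PAN, LRMS): inject PAN detail into each band
    # with one learnable gain per band (NOT the paper's architecture).
    up = upsample(lrms)
    detail = pan - up.mean(axis=-1, keepdims=True)  # shape (N, H, W, 1)
    return up + gain * detail, detail

def train(pan, lrms, hrms, epochs=200, batch_size=32, lr=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    n, bands = pan.shape[0], lrms.shape[-1]
    gain = np.zeros(bands)
    # Adam state: first moment, second moment, step counter (assumed defaults).
    m, v, step = np.zeros(bands), np.zeros(bands), 0
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    history = []
    for epoch in range(epochs):                      # for i in epochs
        order = rng.permutation(n)
        batch_losses = []
        for start in range(0, n, batch_size):        # for j in batches
            idx = order[start:start + batch_size]    # select 32 patches
            fused, detail = forward(gain, pan[idx], lrms[idx])
            err = fused - hrms[idx]
            batch_losses.append(np.abs(err).mean())  # MAE loss L
            # Analytic gradient of the MAE with respect to each band gain.
            grad = (np.sign(err) * detail).mean(axis=(0, 1, 2)) / bands
            # Adam update with bias correction.
            step += 1
            m = beta1 * m + (1 - beta1) * grad
            v = beta2 * v + (1 - beta2) * grad ** 2
            m_hat = m / (1 - beta1 ** step)
            v_hat = v / (1 - beta2 ** step)
            gain = gain - lr * m_hat / (np.sqrt(v_hat) + eps)
        history.append(float(np.mean(batch_losses)))
    return gain, history
```

With arrays shaped like the training set in Table 1 (PAN 64×64×1, LRMS 16×16×4, HRMS 64×64×4 per patch), `train(pan, lrms, hrms)` returns the learned gains and the mean loss per epoch. In practice an automatic-differentiation framework would replace the hand-written gradient.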

Dataset	Image	Training dataset		Testing dataset (reduced resolution)		Testing dataset (full resolution)
		Number	Size	Number	Size	Number	Size
WV4	LRMS	22 000	16×16×4	50	64×64×4	50	256×256×4
	PAN	22 000	64×64×1	50	256×256×1	50	1 024×1 024×1
	HRMS	22 000	64×64×4	50	256×256×4	-	-
QB	LRMS	22 000	16×16×4	50	64×64×4	50	256×256×4
	PAN	22 000	64×64×1	50	256×256×1	50	1 024×1 024×1
	HRMS	22 000	64×64×4	50	256×256×4	-	-
WV2	LRMS	22 000	16×16×8	50	64×64×8	50	256×256×8
	PAN	22 000	64×64×1	50	256×256×1	50	1 024×1 024×1
	HRMS	22 000	64×64×8	50	256×256×8	-	-

Table 1. Specific information about the dataset

	WV4				QB				WV2
Method	ERGAS↓	SAM↓	PSNR↑	SCC↑	ERGAS↓	SAM↓	PSNR↑	SCC↑	ERGAS↓	SAM↓	PSNR↑	SCC↑
MTF-GLP	6.340	5.772	23.524	0.914	2.698	2.334	37.271	0.857	6.338	7.699	26.891	0.878
Wavelet	6.425	6.460	23.401	0.864	4.316	2.981	32.160	0.660	6.703	8.435	26.096	0.845
PCA	6.505	7.337	23.326	0.878	2.981	3.162	36.584	0.792	7.881	8.842	25.081	0.828
IHS	5.661	5.394	24.486	0.902	2.826	2.573	36.048	0.723	6.454	7.780	26.628	0.876
MSDCNN	2.811	3.232	30.590	0.973	1.359	1.468	43.334	0.953	4.036	5.145	30.938	0.944
FusionNet	2.910	3.190	30.280	0.972	1.270	1.369	43.856	0.959	3.845	5.050	31.217	0.948
Panformer	2.820	3.170	30.677	0.975	1.251	1.362	44.077	0.961	3.888	5.013	31.229	0.948
LAGConv	2.693	3.110	30.956	0.976	1.272	1.406	43.813	0.958	3.878	5.070	31.140	0.947
TFNet	2.585	3.115	31.390	0.978	1.238	1.344	44.154	0.961	3.795	5.003	31.397	0.950
MSCANet	2.275	2.831	32.478	0.982	1.233	1.310	44.202	0.962	3.665	4.869	31.691	0.953

Table 2. Objective evaluation index of simulation dataset
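
For readers who want to reproduce reference-based indices of the kind reported in Table 2, here is a minimal NumPy sketch of SAM, PSNR, and ERGAS using their commonly cited definitions. The page does not state the paper's exact conventions, so the following are assumptions: images are `(H, W, C)` arrays, SAM is reported in degrees, `max_value` is the data range for PSNR, and `ratio=4` is the PAN-to-MS resolution ratio implied by the patch sizes in Table 1. SCC is omitted because it depends on a choice of high-pass filter.

```python
import numpy as np

def sam_degrees(reference, fused, eps=1e-12):
    # Mean spectral angle between per-pixel spectra, in degrees (lower is better).
    dot = (reference * fused).sum(axis=-1)
    norms = np.linalg.norm(reference, axis=-1) * np.linalg.norm(fused, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

def psnr(reference, fused, max_value=1.0):
    # Peak signal-to-noise ratio in dB (higher is better).
    # Identical images give mse == 0 and therefore an infinite PSNR.
    mse = np.mean((reference - fused) ** 2)
    return float(10.0 * np.log10(max_value ** 2 / mse))

def ergas(reference, fused, ratio=4):
    # ERGAS = 100 / ratio * sqrt(mean over bands of RMSE_b^2 / mean_b^2)
    # (lower is better); mean_b is the mean of reference band b.
    rmse_sq = np.mean((reference - fused) ** 2, axis=(0, 1))
    mean_sq = np.mean(reference, axis=(0, 1)) ** 2
    return float(100.0 / ratio * np.sqrt(np.mean(rmse_sq / mean_sq)))
```

Each function takes one reference image (HRMS) and one fused image; the values in a table such as Table 2 would then be averages over the test images.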

	WV4			QB			WV2
Method	$D_{\lambda}$↓	$D_{S}$↓	QNR↑	$D_{\lambda}$↓	$D_{S}$↓	QNR↑	$D_{\lambda}$↓	$D_{S}$↓	QNR↑
MTF-GLP	0.065 5	0.050 9	0.887 1	0.095 7	0.150 9	0.768 9	0.094 2	0.065 3	0.847 0
Wavelet	0.014 1	0.039 8	0.946 7	0.133 5	0.151 4	0.738 2	0.046 9	0.073 1	0.883 6
PCA	0.034 8	0.064 7	0.902 8	0.016 4	0.083 9	0.901 1	0.069 5	0.056 8	0.877 6
IHS	0.013 3	0.067 0	0.920 6	0.018 0	0.091 9	0.891 8	0.025 9	0.047 2	0.928 2
MSDCNN	0.024 0	0.016 4	0.960 0	0.013 2	0.034 0	0.953 3	0.018 0	0.046 2	0.936 6
FusionNet	0.027 8	0.026 4	0.946 5	0.014 1	0.029 1	0.957 3	0.017 2	0.031 6	0.951 8
Panformer	0.040 0	0.018 8	0.942 0	0.015 1	0.037 3	0.948 2	0.020 3	0.031 4	0.948 9
LAGConv	0.030 6	0.018 9	0.951 1	0.014 9	0.054 1	0.931 8	0.017 7	0.029 5	0.953 4
TFNet	0.019 9	0.026 1	0.954 5	0.015 4	0.041 6	0.943 6	0.014 9	0.048 7	0.937 2
MSCANet	0.018 1	0.008 8	0.973 2	0.011 0	0.031 1	0.958 2	0.015 5	0.023 4	0.961 5

Table 3. Objective evaluation index of real dataset
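
The three columns of Table 3 are related: in its usual definition, QNR combines the spectral distortion $D_{\lambda}$ and the spatial distortion $D_{S}$ as $\mathrm{QNR} = (1 - D_{\lambda})^{\alpha} (1 - D_{S})^{\beta}$. The sketch below assumes the common choice $\alpha = \beta = 1$, which this page does not state explicitly; the function name `qnr` is introduced here for illustration.

```python
def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    # Quality with No Reference: equals 1 when both distortions are 0.
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta

# MSCANet on WV4, using the averaged distortions reported in Table 3.
value = qnr(0.0181, 0.0088)
```

Applied to the averaged distortions of a row, this lands close to the tabulated QNR but need not match it to the last digit, since an average of per-image QNR values is not the same as the QNR of averaged distortions.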

Model	ERGAS↓	SAM↓	PSNR↑	SCC↑
Non-injection model	2.412	2.926	31.944	0.980
Injection model	2.268	2.816	32.488	0.983

Table 4. Ablation result of injection model in WV4 dataset

Structure	MLP	Multi-scale CNN	Channel attention	ERGAS↓	SAM↓	PSNR↑	SCC↑
Fig. 12(a)	√			2.767	3.156	30.763	0.974
Fig. 12(b)		√		2.377	2.916	32.100	0.981
Fig. 12(c)		√	√	2.268	2.816	32.488	0.983

Table 5. Ablation result of MSCA in WV4 dataset

MAE	Spectral loss	Spatial loss	ERGAS↓	SAM↓	PSNR↑	SCC↑
√			2.362	2.873	32.109	0.981
√	√		2.343	2.845	32.211	0.981
√		√	2.329	2.900	32.261	0.982
√	√	√	2.268	2.816	32.488	0.983

Table 6. Ablation result of loss function in WV4 dataset

Method	Runtime/s	Parameters
MTF-GLP	0.919	-
Wavelet	0.095	-
PCA	0.122	-
IHS	0.105	-
MSDCNN	0.046	$0.19 \times 10^{6}$
FusionNet	0.053	$0.15 \times 10^{6}$
Panformer	0.197	$1.85 \times 10^{6}$
LAGConv	0.079	$0.05 \times 10^{6}$
TFNet	0.125	$2.36 \times 10^{6}$
MSCANet	0.147	$1.99 \times 10^{6}$

Table 7. Average test time and number of parameters for all methods
