• Acta Photonica Sinica
  • Vol. 52, Issue 11, 1110001 (2023)
Zitong LI, Jiankang ZHAO*, Jingran XU, Haihui LONG, and Chuanqi LIU
Author Affiliations
  • School of Electronic Information and Electrical Engineering, School of Perceptual Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
    DOI: 10.3788/gzxb20235211.1110001
    Zitong LI, Jiankang ZHAO, Jingran XU, Haihui LONG, Chuanqi LIU. Remote Sensing Image Fusion Method Based on Improved Swin Transformer[J]. Acta Photonica Sinica, 2023, 52(11): 1110001
    Fig. 1. Overall network structure
    Fig. 2. Detail injection model
    Fig. 3. Multi-scale CNN and channel attention module
    Fig. 4. Structure of feature reconstruction network
    Fig. 5. Fusion result of WorldView-4 simulation dataset
    Fig. 6. Residual graph of WorldView-4 simulation dataset
    Fig. 7. Fusion result of QuickBird simulation dataset
    Fig. 8. Residual graph of QuickBird simulation dataset
    Fig. 9. Fusion result of WorldView-2 simulation dataset
    Fig. 10. Residual graph of WorldView-2 simulation dataset
    Fig. 11. Fusion result of WorldView-4 real dataset
    Fig. 12. Three different window attention unit structures
    for i in epochs (epoch i; the maximum number of epochs is set to 200)
        for j in batches (batch j)
            Select 32 patches of PAN images;
            Select 32 patches of LRMS images;
            Select 32 patches of HRMS images;
            Produce the output P̂ = f(PAN, LRMS), the fused image generated by the model;
            Calculate the loss L between the fused image and the reference image;
            Update the model parameters with the Adam optimizer according to L;
        end
    end
    Table 0. [in Chinese]
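The training procedure listed above can be sketched in plain Python. This is only an illustration under loud assumptions: the real network f is replaced by a hypothetical toy model with a single scalar gain, the LRMS patches are assumed to be already upsampled to the PAN size, the loss L is taken to be a plain MAE, and the names `fuse`, `make_toy_data` and `train` are invented for this sketch. Only the batch size of 32, the limit of 200 epochs and the use of Adam come from the table.

```python
import math
import random

BATCH_SIZE = 32    # patches per batch (from the training table)
MAX_EPOCHS = 200   # maximum number of epochs (from the training table)

def fuse(pan, lrms_up, gain):
    # Hypothetical toy stand-in for f(PAN, LRMS): add PAN detail scaled by one gain.
    return [l + gain * (p - l) for p, l in zip(pan, lrms_up)]

def mae_loss_and_grad(pan, lrms_up, hrms, gain):
    # Mean absolute error between fused and reference patch, plus d(loss)/d(gain).
    fused = fuse(pan, lrms_up, gain)
    n = len(fused)
    loss = sum(abs(f - h) for f, h in zip(fused, hrms)) / n
    grad = sum(math.copysign(1.0, f - h) * (p - l)
               for f, h, p, l in zip(fused, hrms, pan, lrms_up) if f != h) / n
    return loss, grad

def make_toy_data(n_patches=128, n_pixels=16, true_gain=0.7, seed=1):
    # Synthetic single-band patches; the reference is built with a known gain.
    rng = random.Random(seed)
    pan = [[rng.random() for _ in range(n_pixels)] for _ in range(n_patches)]
    lrms = [[rng.random() for _ in range(n_pixels)] for _ in range(n_patches)]
    hrms = [fuse(p, l, true_gain) for p, l in zip(pan, lrms)]
    return pan, lrms, hrms

def train(pan_set, lrms_set, hrms_set, lr=0.005,
          beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    rng = random.Random(seed)
    gain, m, v, step = 0.0, 0.0, 0.0, 0
    loss = 0.0
    indices = list(range(len(pan_set)))
    for _epoch in range(MAX_EPOCHS):
        rng.shuffle(indices)
        for start in range(0, len(indices), BATCH_SIZE):
            batch = indices[start:start + BATCH_SIZE]   # select 32 patches
            loss = grad = 0.0
            for k in batch:
                l_k, g_k = mae_loss_and_grad(pan_set[k], lrms_set[k], hrms_set[k], gain)
                loss += l_k / len(batch)
                grad += g_k / len(batch)
            # Adam update of the single parameter, with bias correction.
            step += 1
            m = beta1 * m + (1 - beta1) * grad
            v = beta2 * v + (1 - beta2) * grad * grad
            m_hat = m / (1 - beta1 ** step)
            v_hat = v / (1 - beta2 ** step)
            gain -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return gain, loss
```

Calling `train(*make_toy_data())` returns the learned gain and the loss of the last batch; on this synthetic data the gain should settle near the value used to build the reference patches.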
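    Dataset | Image | Training number | Training size | Reduced-resolution test number | Reduced-resolution test size | Full-resolution test number | Full-resolution test size
    WV4 | LRMS | 22 000 | 16×16×4 | 50 | 64×64×4 | 50 | 256×256×4
    WV4 | PAN | 22 000 | 64×64×1 | 50 | 256×256×1 | 50 | 1 024×1 024×1
    WV4 | HRMS | 22 000 | 64×64×4 | 50 | 256×256×4 | - | -
    QB | LRMS | 22 000 | 16×16×4 | 50 | 64×64×4 | 50 | 256×256×4
    QB | PAN | 22 000 | 64×64×1 | 50 | 256×256×1 | 50 | 1 024×1 024×1
    QB | HRMS | 22 000 | 64×64×4 | 50 | 256×256×4 | - | -
    WV2 | LRMS | 22 000 | 16×16×8 | 50 | 64×64×8 | 50 | 256×256×8
    WV2 | PAN | 22 000 | 64×64×1 | 50 | 256×256×1 | 50 | 1 024×1 024×1
    WV2 | HRMS | 22 000 | 64×64×8 | 50 | 256×256×8 | - | -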
    Table 1. Specific information about the dataset
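The patch sizes in Table 1 imply a resolution ratio of 4 between PAN and LRMS, with HRMS matching PAN in size and LRMS in band count. A small consistency check makes that relation explicit. The shapes are the WV4 training sizes from Table 1; the function name `scale_ratio` and the (height, width, bands) tuple layout are assumptions of this sketch.

```python
# Patch shapes (height, width, bands) for the WV4 training set, copied from Table 1.
TRAIN_SHAPES = {
    "LRMS": (16, 16, 4),
    "PAN": (64, 64, 1),
    "HRMS": (64, 64, 4),
}

def scale_ratio(shapes):
    """Return the PAN-to-LRMS resolution ratio after checking shape consistency."""
    lr_h, lr_w, lr_bands = shapes["LRMS"]
    pan_h, pan_w, pan_bands = shapes["PAN"]
    hr_h, hr_w, hr_bands = shapes["HRMS"]
    if pan_bands != 1:
        raise ValueError("PAN is expected to have a single band")
    if (hr_h, hr_w) != (pan_h, pan_w) or hr_bands != lr_bands:
        raise ValueError("HRMS should match PAN in size and LRMS in band count")
    if pan_h % lr_h or pan_w % lr_w or pan_h // lr_h != pan_w // lr_w:
        raise ValueError("PAN size should be an integer multiple of LRMS size")
    return pan_h // lr_h
```

For the shapes above, `scale_ratio(TRAIN_SHAPES)` evaluates to 4.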
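    Method | WV4 ERGAS↓ | WV4 SAM↓ | WV4 PSNR↑ | WV4 SCC↑ | QB ERGAS↓ | QB SAM↓ | QB PSNR↑ | QB SCC↑ | WV2 ERGAS↓ | WV2 SAM↓ | WV2 PSNR↑ | WV2 SCC↑
    MTF-GLP | 6.340 | 5.772 | 23.524 | 0.914 | 2.698 | 2.334 | 37.271 | 0.857 | 6.338 | 7.699 | 26.891 | 0.878
    Wavelet | 6.425 | 6.460 | 23.401 | 0.864 | 4.316 | 2.981 | 32.160 | 0.660 | 6.703 | 8.435 | 26.096 | 0.845
    PCA | 6.505 | 7.337 | 23.326 | 0.878 | 2.981 | 3.162 | 36.584 | 0.792 | 7.881 | 8.842 | 25.081 | 0.828
    IHS | 5.661 | 5.394 | 24.486 | 0.902 | 2.826 | 2.573 | 36.048 | 0.723 | 6.454 | 7.780 | 26.628 | 0.876
    MSDCNN | 2.811 | 3.232 | 30.590 | 0.973 | 1.359 | 1.468 | 43.334 | 0.953 | 4.036 | 5.145 | 30.938 | 0.944
    FusionNet | 2.910 | 3.190 | 30.280 | 0.972 | 1.270 | 1.369 | 43.856 | 0.959 | 3.845 | 5.050 | 31.217 | 0.948
    Panformer | 2.820 | 3.170 | 30.677 | 0.975 | 1.251 | 1.362 | 44.077 | 0.961 | 3.888 | 5.013 | 31.229 | 0.948
    LAGConv | 2.693 | 3.110 | 30.956 | 0.976 | 1.272 | 1.406 | 43.813 | 0.958 | 3.878 | 5.070 | 31.140 | 0.947
    TFNet | 2.585 | 3.115 | 31.390 | 0.978 | 1.238 | 1.344 | 44.154 | 0.961 | 3.795 | 5.003 | 31.397 | 0.950
    MSCANet | 2.275 | 2.831 | 32.478 | 0.982 | 1.233 | 1.310 | 44.202 | 0.962 | 3.665 | 4.869 | 31.691 | 0.953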
    Table 2. Objective evaluation index of simulation dataset
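For readers unfamiliar with the reference-based indices in Table 2, the sketch below implements the commonly used textbook definitions of PSNR, SAM and ERGAS in plain Python. It is not the authors' evaluation code: the image layout (a list of bands, each a flat list of pixel values), the peak value of 1.0 and the function names are assumptions, and the resolution ratio of 4 is inferred from the patch sizes in Table 1. SCC is omitted because it depends on a choice of high-pass filter that is not given here.

```python
import math

def psnr(fused, ref, peak=1.0):
    """Peak signal-to-noise ratio in dB over all bands and pixels (higher is better)."""
    diffs = [(f - r) ** 2 for fb, rb in zip(fused, ref) for f, r in zip(fb, rb)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def sam(fused, ref):
    """Mean spectral angle in degrees between per-pixel spectra (lower is better)."""
    angles = []
    # zip(*image) regroups the band lists into one spectrum per pixel.
    for f_vec, r_vec in zip(zip(*fused), zip(*ref)):
        dot = sum(f * r for f, r in zip(f_vec, r_vec))
        norm = math.sqrt(sum(f * f for f in f_vec)) * math.sqrt(sum(r * r for r in r_vec))
        if norm == 0:
            continue  # skip all-zero spectra, where the angle is undefined
        cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
        angles.append(math.degrees(math.acos(cos)))
    return sum(angles) / len(angles) if angles else 0.0

def ergas(fused, ref, ratio=4):
    """ERGAS = (100 / ratio) * sqrt(mean over bands of MSE_b / mean_b^2); lower is better."""
    terms = []
    for fb, rb in zip(fused, ref):
        mse = sum((f - r) ** 2 for f, r in zip(fb, rb)) / len(rb)
        mean = sum(rb) / len(rb)
        terms.append(mse / mean ** 2)
    return 100.0 / ratio * math.sqrt(sum(terms) / len(terms))
```

All three functions take the fused image first and the reference image second. A fused image identical to the reference gives an infinite PSNR and zero SAM and ERGAS, which matches the direction of the arrows in the table header.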
    Method | WV4 Dλ | WV4 Ds | WV4 QNR↑ | QB Dλ | QB Ds | QB QNR↑ | WV2 Dλ | WV2 Ds | WV2 QNR↑
    MTF-GLP | 0.065 5 | 0.050 9 | 0.887 1 | 0.095 7 | 0.150 9 | 0.768 9 | 0.094 2 | 0.065 3 | 0.847 0
    Wavelet | 0.014 1 | 0.039 8 | 0.946 7 | 0.133 5 | 0.151 4 | 0.738 2 | 0.046 9 | 0.073 1 | 0.883 6
    PCA | 0.034 8 | 0.064 7 | 0.902 8 | 0.016 4 | 0.083 9 | 0.901 1 | 0.069 5 | 0.056 8 | 0.877 6
    IHS | 0.013 3 | 0.067 0 | 0.920 6 | 0.018 0 | 0.091 9 | 0.891 8 | 0.025 9 | 0.047 2 | 0.928 2
    MSDCNN | 0.024 0 | 0.016 4 | 0.960 0 | 0.013 2 | 0.034 0 | 0.953 3 | 0.018 0 | 0.046 2 | 0.936 6
    FusionNet | 0.027 8 | 0.026 4 | 0.946 5 | 0.014 1 | 0.029 1 | 0.957 3 | 0.017 2 | 0.031 6 | 0.951 8
    Panformer | 0.040 0 | 0.018 8 | 0.942 0 | 0.015 1 | 0.037 3 | 0.948 2 | 0.020 3 | 0.031 4 | 0.948 9
    LAGConv | 0.030 6 | 0.018 9 | 0.951 1 | 0.014 9 | 0.054 1 | 0.931 8 | 0.017 7 | 0.029 5 | 0.953 4
    TFNet | 0.019 9 | 0.026 1 | 0.954 5 | 0.015 4 | 0.041 6 | 0.943 6 | 0.014 9 | 0.048 7 | 0.937 2
    MSCANet | 0.018 1 | 0.008 8 | 0.973 2 | 0.011 0 | 0.031 1 | 0.958 2 | 0.015 5 | 0.023 4 | 0.961 5
    Table 3. Objective evaluation index of real dataset
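The three no-reference indices in Table 3 are linked: in its usual definition, QNR combines the spectral distortion Dλ and the spatial distortion Ds as QNR = (1 − Dλ)^α · (1 − Ds)^β. The sketch below assumes α = β = 1, which is the common default but is not confirmed by the text shown here, and checks the relation against three WV4 rows of Table 3. Because the table reports averages over test images, the product of the averaged distortions only approximates the averaged QNR.

```python
def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """Quality with No Reference from spectral (d_lambda) and spatial (d_s) distortion."""
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta

# (D_lambda, D_s, reported QNR) for the WV4 real dataset, copied from Table 3.
WV4_ROWS = {
    "MTF-GLP": (0.0655, 0.0509, 0.8871),
    "TFNet": (0.0199, 0.0261, 0.9545),
    "MSCANet": (0.0181, 0.0088, 0.9732),
}

def max_deviation(rows):
    """Largest gap between the reported QNR and the product formula."""
    return max(abs(qnr(d_l, d_s) - reported) for d_l, d_s, reported in rows.values())
```

For these rows, `max_deviation(WV4_ROWS)` stays below 0.001, so the reported values are consistent with the product form up to averaging and rounding.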
    Model | ERGAS↓ | SAM↓ | PSNR↑ | SCC↑
    Non-injection model | 2.412 | 2.926 | 31.944 | 0.980
    Injection model | 2.268 | 2.816 | 32.488 | 0.983
    Table 4. Ablation result of injection model in WV4 dataset
    Structure | MLP | Multi-scale CNN | Channel-attention | ERGAS↓ | SAM↓ | PSNR↑ | SCC↑
    Fig. 12(a) |  |  |  | 2.767 | 3.156 | 30.763 | 0.974
    Fig. 12(b) |  |  |  | 2.377 | 2.916 | 32.100 | 0.981
    Fig. 12(c) |  |  |  | 2.268 | 2.816 | 32.488 | 0.983
    Table 5. Ablation result of MSCA in WV4 dataset
    MAE | Spectral loss | Spatial loss | ERGAS↓ | SAM↓ | PSNR↑ | SCC↑
     |  |  | 2.362 | 2.873 | 32.109 | 0.981
     |  |  | 2.343 | 2.845 | 32.211 | 0.981
     |  |  | 2.329 | 2.900 | 32.261 | 0.982
     |  |  | 2.268 | 2.816 | 32.488 | 0.983
    Table 6. Ablation result of loss function in WV4 dataset
    Method | Runtime/s | Parameters
    MTF-GLP | 0.919 | -
    Wavelet | 0.095 | -
    PCA | 0.122 | -
    IHS | 0.105 | -
    MSDCNN | 0.046 | 0.19×10^6
    FusionNet | 0.053 | 0.15×10^6
    Panformer | 0.197 | 1.85×10^6
    LAGConv | 0.079 | 0.05×10^6
    TFNet | 0.125 | 2.36×10^6
    MSCANet | 0.147 | 1.99×10^6
    Table 7. Average test time and number of parameters for all methods