• Laser & Optoelectronics Progress
  • Vol. 60, Issue 14, 1410019 (2023)
Xiaojun He1, Xuan Liu1,2,*, and Xian Wei2
Author Affiliations
  • 1College of Software, Liaoning Technical University, Huludao 125105, Liaoning, China
  • 2Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, Fujian, China
    DOI: 10.3788/LOP222166
    Xiaojun He, Xuan Liu, Xian Wei. Classification Method of High-Resolution Remote Sensing Scene Image Based on Dictionary Learning and Vision Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(14): 1410019
    Fig. 1. Diagram of dictionary learning
    Fig. 2. Flowchart of the proposed method
    Fig. 3. Batch normalization and layer normalization
    Fig. 4. Schematic of multilayer perceptron
    Fig. 5. Flowchart of attention module method
    Fig. 6. Attention module based on dictionary learning
    Fig. 7. RSSCN7 dataset
    Fig. 8. NWPU-RESISC45 dataset
    Fig. 9. AID dataset
    Fig. 10. Rate of change of classification accuracy on Gaussian noise images
    Dataset | Number of scene classes | Number of total images | Image size | Spatial resolution /m | Year
    RSSCN7 | 7 | 2800 | 400×400 | - | 2015
    NWPU-RESISC45 | 45 | 31500 | 256×256 | ~30-0.2 | 2016
    AID | 30 | 10000 | 600×600 | ~8-0.5 | 2017
    Table 1. Introduction of datasets
    Laboratory environment | Environment configuration
    Language | Python 3.8.6
    Tool | PyCharm 11.0.11
    Framework | PyTorch 1.9.1
     | CUDA 10.2
    Table 2. Laboratory environment
    Network | Accuracy /%
    AlexNet | 82.230
    VGG | 80.833
    ResNet50 | 89.048
    TNT | 84.833
    ViT | 89.643
    Proposed network | 91.406
    Table 3. Accuracy of different networks on RSSCN7 dataset
    Network | Accuracy /%
    Fine-tuned AlexNet | 85.160
    Fine-tuned VGGNet-16 | 90.360
    Fine-tuned GoogLeNet | 86.020
    TNT | 85.031
    ViT | 90.255
    Proposed network | 91.576
    Table 4. Accuracy of different networks on NWPU-RESISC45 dataset
    Network | Accuracy /%
    CaffeNet | 86.860
    VGG-VD-16 | 86.590
    ResNet152 | 89.130
    GoogLeNet | 83.440
    TNT | 80.450
    ViT | 85.514
    Proposed network | 89.218
    Table 5. Accuracy of different networks on AID dataset
    Parameter | RSSCN7: ViT | RSSCN7: Proposed method | NWPU-RESISC45: ViT | NWPU-RESISC45: Proposed method | AID: ViT | AID: Proposed method
    kappa | 0.900 | 0.916 | 0.934 | 0.947 | 0.883 | 0.909
    F1 | 86.222 | 90.890 | 88.927 | 90.207 | 84.202 | 87.768
    recall | 85.986 | 91.142 | 88.984 | 90.286 | 84.147 | 87.662
    precision | 86.417 | 91.002 | 89.039 | 90.317 | 84.558 | 88.004
    Table 6. Parameter indicators of two methods on three datasets
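    The indicators in Table 6 are listed here without their formulas. As a minimal sketch, assuming the standard definitions of Cohen's kappa, precision, recall, and F1, and assuming macro-averaging over classes (the averaging scheme is not stated in this listing), they could be computed from a confusion matrix roughly as follows. The function name `classification_metrics` and the toy 2×2 matrix are hypothetical and are not taken from the article.

```python
import numpy as np


def classification_metrics(confusion: np.ndarray) -> dict:
    """Kappa plus macro-averaged precision, recall, and F1 (in percent).

    confusion[i, j] counts samples of true class i predicted as class j.
    Macro-averaging is an assumption; the article may average differently.
    """
    confusion = confusion.astype(float)
    total = confusion.sum()
    true_pos = np.diag(confusion)
    row_sums = confusion.sum(axis=1)  # samples per true class
    col_sums = confusion.sum(axis=0)  # samples per predicted class

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = true_pos.sum() / total
    expected = (row_sums * col_sums).sum() / total**2
    kappa = (observed - expected) / (1.0 - expected)

    # Per-class scores; classes with an empty denominator score 0.
    zeros = np.zeros_like(true_pos)
    precision = np.divide(true_pos, col_sums, out=zeros.copy(), where=col_sums > 0)
    recall = np.divide(true_pos, row_sums, out=zeros.copy(), where=row_sums > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)

    return {
        "kappa": float(kappa),
        "precision": float(100 * precision.mean()),
        "recall": float(100 * recall.mean()),
        "f1": float(100 * f1.mean()),
    }


if __name__ == "__main__":
    # Toy two-class example, for illustration only (not data from the article).
    toy = np.array([[8, 2],
                    [1, 9]])
    print(classification_metrics(toy))
```

    Kappa is returned on a 0 to 1 scale and the other three as percentages, matching the scales used in Table 6.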
    Network | Number of parameters /10^6
    AlexNet | 6
    VGG | 13.3
    ResNet50 | 2.55
    TNT | 2.25
    ViT | 2.6
    Proposed method | 1.84
    Table 7. Parameters of different classification frameworks