• Acta Optica Sinica
  • Vol. 39, Issue 3, 0301002 (2019)
Xi Gong¹, Liang Wu¹,², Zhong Xie¹,², Zhanlong Chen¹,², Yuanyuan Liu¹,*, and Kan Yu³
Author Affiliations
  • 1 Department of Information Engineering, China University of Geosciences, Wuhan, Hubei 430074, China
  • 2 National Engineering Research Center of Geographic Information System, Wuhan, Hubei 430074, China
  • 3 Department of Information Science and Technology, Wenhua College, Wuhan, Hubei 430074, China
    DOI: 10.3788/AOS201939.0301002
    Xi Gong, Liang Wu, Zhong Xie, Zhanlong Chen, Yuanyuan Liu, Kan Yu. Classification Method of High-Resolution Remote Sensing Scenes Based on Fusion of Global and Local Deep Features[J]. Acta Optica Sinica, 2019, 39(3): 0301002
    Fig. 1. Flow chart of GLDFB
    Fig. 2. Network structure of VGG-19
    Fig. 3. Reconstruction and coding of convolutional layer features
    Fig. 4. Image examples of remote sensing scene. (a) UCM dataset; (b) SIRI dataset
    Fig. 5. Time consumption for single iteration in k-means clustering process of 12 convolutional layer features under different K values. (a) UCM dataset; (b) SIRI dataset
    Fig. 6. Classification accuracies of 12 convolutional layer features under different K values. (a) UCM dataset; (b) SIRI dataset
    Fig. 7. Classification confusion matrix of GLDFB on UCM dataset
    Fig. 8. Two kinds of misclassified scenes. (a) Road type; (b) building type
    Fig. 9. Classification confusion matrix of GLDFB on SIRI dataset
    Fig. 10. GLDFB results. (a) USGS large remote sensing image; (b) classification result
    No.  Layer name  Feature size
    1    conv1_1     64×224×224
    2    conv1_2     64×224×224
    3    conv2_1     128×112×112
    4    conv2_2     128×112×112
    5    conv3_1     256×56×56
    6    conv3_2     256×56×56
    7    conv3_3     256×56×56
    8    conv3_4     256×56×56
    9    conv4_1     512×28×28
    10   conv4_2     512×28×28
    11   conv4_3     512×28×28
    12   conv4_4     512×28×28
    13   conv5_1     512×14×14
    14   conv5_2     512×14×14
    15   conv5_3     512×14×14
    16   conv5_4     512×14×14
    Table 1. Output feature dimensions of VGG-19 convolutional layers
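The sizes in Table 1 follow a regular pattern: the channel count is fixed within a block, and the spatial side halves from one block to the next. The minimal sketch below reproduces them, assuming the standard VGG-19 layout (3×3 convolutions with padding 1, a 2×2 max-pooling after each block, and a 224×224 input). The constant `VGG19_BLOCKS` and the function `vgg19_conv_feature_sizes` are illustrative names, not taken from the paper.

```python
# Standard VGG-19 convolutional configuration as (channels, number of convs)
# per block. This is an assumption about the network variant; the source
# only lists the resulting feature sizes.
VGG19_BLOCKS = [(64, 2), (128, 2), (256, 4), (512, 4), (512, 4)]


def vgg19_conv_feature_sizes(input_size=224):
    """Return {layer_name: (channels, height, width)} for every conv layer."""
    sizes = {}
    side = input_size
    for block_idx, (channels, n_convs) in enumerate(VGG19_BLOCKS, start=1):
        for conv_idx in range(1, n_convs + 1):
            # 3x3 convolutions with padding 1 keep the spatial size unchanged.
            sizes[f"conv{block_idx}_{conv_idx}"] = (channels, side, side)
        # The 2x2 max-pooling that closes each block halves the spatial size.
        side //= 2
    return sizes


if __name__ == "__main__":
    for name, (c, h, w) in vgg19_conv_feature_sizes().items():
        print(f"{name}: {c}×{h}×{w}")
```

Under these assumptions, `vgg19_conv_feature_sizes()["conv4_1"]` gives `(512, 28, 28)`, which matches row 9 of Table 1.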
    Layer type         UCM                                      SIRI
                       K=100   K=500   K=1000  K=2000  K=3000   K=100   K=500   K=1000  K=1500  K=2000
    Middle layer       90.14   94.24   94.60   95.89   95.42    91.22   93.49   93.91   94.58   94.32
    Middle-high layer  89.76   95.18   95.42   95.95   96.49    89.48   93.96   94.51   94.91   95.16
    High layer         88.87   94.46   94.94   95.42   94.88    87.80   92.12   92.88   93.65   93.44
    Table 2. Average classification accuracy comparison of three kinds of convolutional layer features under different K values
    Dataset  Feature           Accuracy /%
    UCM      HOG               52.14
    UCM      SIFT              58.33
    UCM      LBP               31.43
    UCM      CNN (6conv+2fc)   63.10
    SIRI     HOG               44.79
    SIRI     SIFT              53.96
    SIRI     LBP               46.25
    SIRI     CNN (6conv+2fc)   60.42
    Table 3. Classification accuracies of several other features
    No.  Feature                 Accuracy /%
                                 UCM     SIRI
    1    FC6                     94.60   93.54
    2    conv4_1                 96.90   95.63
    3    SIFT+HOG                73.81   67.92
    4    SIFT+FC6                95.00   95.00
    5    GLDFB (conv4_1+FC6)     97.62   96.67
    Table 4. Classification accuracy comparison of many kinds of features
    No.  Method                             Accuracy /%
    1    RF                                 44.77
    2    SIFT+BoVW                          76.81
    3    SPCK[4]                            77.38
    4    VGG-19 (training from scratch)     83.48
    5    Resnet50 (training from scratch)   85.71
    6    CaffeNet[11]                       93.42±1.00
    7    DCT-CNN[7]                         95.76
    8    GLDFB                              97.62
    Table 5. Classification accuracy comparison on UCM dataset
    No.  Method                             Accuracy /%
    1    RF                                 49.90
    2    SIFT+BoVW                          75.63
    3    SPMK[3]                            77.69±1.01
    4    VGG-19 (training from scratch)     86.13
    5    MeanStd-SIFT+LDA-H[17]             86.29
    6    Resnet50 (training from scratch)   89.26
    7    GLDFB                              96.67
    Table 6. Classification accuracy comparison on SIRI dataset
    Pre-training model  Local feature      Accuracy /%
                        extraction layer   Local feature  Global feature  Fused feature
    Alexnet[18]         conv3              93.81          95.24           96.91
    Caffenet[19]        conv3              94.05          96.90           97.62
    VGG-F[20]           conv3              95.24          96.19           97.62
    VGG-M[20]           conv3              95.00          96.43           97.62
    VGG-S[20]           conv3              93.81          96.43           96.67
    VGG-16[14]          conv4_1            95.00          96.19           95.95
    Resnet50[21]        Res3a              95.71          96.90           97.86
    Resnet101[21]       Res3a              95.23          96.90           97.86
    Table 7. Classification results of GLDFB with other pre-training CNNs