• Acta Photonica Sinica
  • Vol. 51, Issue 3, 0310003 (2022)
Ying XIA1,*, Junyao LI1, and Dongen GUO1,2
Author Affiliations
  • 1Chongqing University of Posts and Telecommunications, Chongqing Engineering Research Center of Spatial Big Data Intelligent Technology, Chongqing 400065, China
  • 2School of Computer and Software, Nanyang Institute of Technology, Nanyang, Henan 473000, China
    DOI: 10.3788/gzxb20225103.0310003
    Ying XIA, Junyao LI, Dongen GUO. Semi-supervised Scene Classification of Remote Sensing Images Based on GAN[J]. Acta Photonica Sinica, 2022, 51(3): 0310003

    Abstract

    Remote sensing image scene classification is an important and challenging problem in remote sensing image interpretation. As large numbers of scene-rich, high-resolution remote sensing images become available, scene classification is widely used in fields such as smart city construction, natural disaster monitoring, and land resource utilization. Advances in deep learning and the establishment of large-scale scene classification datasets have significantly improved scene classification methods. Although deep-learning-based methods achieve high classification accuracy, supervised methods require a large number of training samples, while unsupervised methods have low classification accuracy and struggle to meet practical needs. Meanwhile, annotating remote sensing images requires engineering skill and expert knowledge. In most remote sensing applications only a small number of labeled images are available for supervised training, and the large number of unlabeled images cannot be fully utilized. Semi-supervised learning, which learns from a small amount of labeled data while extracting effective features from a large amount of unlabeled data, is therefore a promising way to solve such problems. To address the complex backgrounds of remote sensing images and the inability of supervised scene classification algorithms to use unlabeled data, a semi-supervised remote sensing image scene classification method based on generative adversarial networks, namely residual attention generative adversarial networks, is proposed. First, to enhance training stability, residual blocks with skip connections are introduced into the deep neural network. 
At the same time, spectral normalization constrains the spectral norm of the weight matrix in each convolutional layer of the residual block, so that the mapping from input to output for each batch of data satisfies 1-Lipschitz continuity. This keeps generative adversarial training smooth, which not only improves training stability but also avoids network degradation. Second, the shallow features extracted by the bottom convolutional layers contain mostly local information with low-level semantics, while the deep features extracted by the top convolutional layers contain more global information but lose part of the detail. The shallow features are therefore fused with the deep features extracted from the multi-layer spectrally normalized residual blocks, which reduces feature loss and lets the model learn the complementary relationships between different features, improving its representational ability. Finally, to guide the model to focus on important features and suppress unnecessary ones, an attention module that mimics signal processing in the human brain is used. To obtain stronger feature representation and capture the dependencies between features, a gating mechanism is introduced, forming a gated attention module. To verify the superiority of the method, experiments were conducted on two high-resolution remote sensing image datasets, EuroSAT and UC Merced. On the EuroSAT dataset, the highest classification accuracy reached 93.3% and 97.4% when the number of labeled samples was 2 000 and 21 600, respectively. On the UC Merced dataset, the classification accuracy reached 85.7% and 91.0% when the number of labeled samples was 400 and 1 680, respectively. 
To further validate the contribution of each module, ablation experiments were also conducted on the EuroSAT and UC Merced public datasets. They show that the spectrally normalized residual module contributes the most, with improvements across different numbers of labeled samples. The reason is that spectral normalization keeps the gradients of the network within a bounded range during backpropagation, which improves the stability of the generative adversarial network without damaging the network structure. The gated attention module contributes the next most: especially when the proportion of labeled samples exceeds 10%, the classification improvement is larger because the sample size is sufficient to learn more comprehensive features. The feature fusion module contributes the least, because when the sample size is very small the network is not sufficiently trained and extracts some redundant or invalid features, resulting in lower classification accuracy. These experimental results show that, when only a small number of labeled high-resolution remote sensing images are available and discriminative features are therefore difficult to extract, the proposed residual attention generative adversarial network classification method can effectively extract more discriminative features and improve semi-supervised classification performance.
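
The spectral normalization step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `spectral_normalize`, the kernel shape, and the iteration count are assumptions chosen for illustration, and the kernel is flattened to a 2-D matrix as is commonly done for convolutional layers. The largest singular value is estimated by power iteration, and dividing the weights by it gives a matrix with spectral norm close to 1, so the corresponding linear map is approximately 1-Lipschitz in the L2 norm.

```python
import numpy as np


def spectral_normalize(weight, n_iters=100, eps=1e-12, seed=0):
    """Divide `weight` by its largest singular value, estimated by power iteration.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Flatten a conv kernel (out_ch, in_ch, kh, kw) into a matrix (out_ch, in_ch*kh*kw).
    w = weight.reshape(weight.shape[0], -1)
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape[0])
    u /= np.linalg.norm(u) + eps
    for _ in range(n_iters):
        # Alternate between right and left singular-vector estimates.
        v = w.T @ u
        v /= np.linalg.norm(v) + eps
        u = w @ v
        u /= np.linalg.norm(u) + eps
    sigma = float(u @ w @ v)  # estimate of the spectral norm of w
    return weight / sigma, sigma


# Example with an assumed 3x3 kernel, 8 input channels, 16 output channels.
rng = np.random.default_rng(42)
kernel = rng.standard_normal((16, 8, 3, 3))
kernel_sn, sigma = spectral_normalize(kernel)

# Check against an exact SVD: the largest singular value should now be close to 1.
top_singular = np.linalg.svd(kernel_sn.reshape(16, -1), compute_uv=False)[0]
print(f"estimated sigma = {sigma:.4f}, largest singular value after = {top_singular:.4f}")
```

Because the spectral norm is submultiplicative, a stack of such layers with 1-Lipschitz activations stays 1-Lipschitz, which is one way to read the abstract's statement that the gradients remain within a bounded range during backpropagation.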