Classification Method of High-Resolution Remote Sensing Scene Image Based on Dictionary Learning and Vision Transformer

Xiaojun He; Xuan Liu; Xian Wei

doi:10.3788/LOP222166

Laser & Optoelectronics Progress
Vol. 60, Issue 14, 1410019 (2023)

Xiaojun He¹, Xuan Liu¹,²,*, and Xian Wei²

Author Affiliations

¹College of Software, Liaoning Technical University, Huludao 125105, Liaoning, China

²Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, Fujian, China

Xiaojun He, Xuan Liu, Xian Wei. Classification Method of High-Resolution Remote Sensing Scene Image Based on Dictionary Learning and Vision Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(14): 1410019

Fig. 1. Diagram of dictionary learning

Fig. 2. Flowchart of the proposed method

Fig. 3. Batch normalization and layer normalization

Fig. 4. Schematic of multilayer perceptron

Fig. 5. Flowchart of attention module method

Fig. 6. Attention module based on dictionary learning

Fig. 7. RSSCN7 dataset

Fig. 8. NWPU-RESISC45 dataset

Fig. 9. AID dataset

Fig. 10. Rate of change of classification accuracy on Gaussian noise images

Dataset	Number of scene classes	Total number of images	Image size /pixel	Spatial resolution /m	Year
RSSCN7	7	2800	400×400		2015
NWPU-RESISC45	45	31500	256×256	~30-0.2	2016
AID	30	10000	600×600	~8-0.5	2017

Table 1. Overview of the datasets

Experimental environment	Configuration
Language	Python 3.8.6
Tool	PyCharm 11.0.11
Framework	PyTorch 1.9.1
CUDA	10.2

Table 2. Experimental environment

Network	Accuracy /%
AlexNet	82.230
VGG	80.833
ResNet50	89.048
TNT	84.833
ViT	89.643
Proposed network	91.406

Table 3. Accuracy of different networks on RSSCN7 dataset

Network	Accuracy /%
Fine-tuned AlexNet	85.160
Fine-tuned VGGNet-16	90.360
Fine-tuned GoogLeNet	86.020
TNT	85.031
ViT	90.255
Proposed network	91.576

Table 4. Accuracy of different networks on NWPU-RESISC45 dataset

Network	Accuracy /%
CaffeNet	86.860
VGG-VD-16	86.590
ResNet152	89.130
GoogLeNet	83.440
TNT	80.450
ViT	85.514
Proposed network	89.218

Table 5. Accuracy of different networks on AID dataset

Metric	RSSCN7, ViT	RSSCN7, Proposed method	NWPU-RESISC45, ViT	NWPU-RESISC45, Proposed method	AID, ViT	AID, Proposed method
Kappa	0.900	0.916	0.934	0.947	0.883	0.909
F1 /%	86.222	90.890	88.927	90.207	84.202	87.768
Recall /%	85.986	91.142	88.984	90.286	84.147	87.662
Precision /%	86.417	91.002	89.039	90.317	84.558	88.004

Table 6. Evaluation metrics of the two methods on three datasets
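
The kappa, F1, recall, and precision values in Table 6 are standard multi-class metrics that can all be derived from a confusion matrix. The sketch below shows one common way to compute them. It assumes macro averaging over classes (the page does not state which averaging the authors used), and the function name `classification_metrics` and the toy confusion matrix are hypothetical, chosen only for illustration.

```python
def classification_metrics(confusion):
    """Compute accuracy, Cohen's kappa, and macro precision/recall/F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    Assumption: macro averaging (unweighted mean over classes).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]  # true-class counts
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    diag = [confusion[k][k] for k in range(n)]  # correct predictions

    accuracy = sum(diag) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0

    precisions = [d / c if c else 0.0 for d, c in zip(diag, col_sums)]
    recalls = [d / r if r else 0.0 for d, r in zip(diag, row_sums)]
    f1s = [2 * p * r / (p + r) if p + r else 0.0
           for p, r in zip(precisions, recalls)]

    return {
        "accuracy": 100 * accuracy,               # in %
        "kappa": kappa,                           # unitless, at most 1
        "precision": 100 * sum(precisions) / n,   # macro, in %
        "recall": 100 * sum(recalls) / n,         # macro, in %
        "f1": 100 * sum(f1s) / n,                 # macro, in %
    }


if __name__ == "__main__":
    # Toy two-class example, not data from the paper.
    toy = [[8, 2],
           [1, 9]]
    for name, value in classification_metrics(toy).items():
        print(f"{name}: {value:.3f}")
```

Kappa corrects the raw accuracy for agreement expected by chance, which is why it is reported on a different scale from the percentage metrics.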

Network	Number of parameters /10⁶
AlexNet	6
VGG	13.3
ResNet50	2.55
TNT	2.25
ViT	2.6
Proposed method	1.84

Table 7. Parameters of different classification frameworks
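
Parameter counts like those in Table 7 come from summing the sizes of all learnable weights and biases in a network. As a rough illustration, the sketch below counts the parameters of one standard ViT encoder block by arithmetic. It assumes the usual layout (four `dim × dim` attention projections with biases, a two-layer MLP with expansion ratio 4, and two layer normalizations); it does not model the authors' dictionary-learning attention module, and the helper names are hypothetical.

```python
def linear_params(in_features, out_features, bias=True):
    """Learnable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def encoder_block_params(dim, mlp_ratio=4):
    """Parameters of one standard ViT encoder block (assumed layout)."""
    # Query, key, value, and output projections, each dim -> dim.
    attention = 4 * linear_params(dim, dim)
    # Two-layer MLP: dim -> hidden -> dim.
    hidden = dim * mlp_ratio
    mlp = linear_params(dim, hidden) + linear_params(hidden, dim)
    # Two layer normalizations, each with a scale and a shift vector.
    layer_norms = 2 * 2 * dim
    return attention + mlp + layer_norms


if __name__ == "__main__":
    count = encoder_block_params(768)
    print(f"{count / 1e6:.2f} x 10^6 parameters per block")
```

In PyTorch, the same total for an actual model is usually obtained with `sum(p.numel() for p in model.parameters())`.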
