Yanqiu Li, Shengzhao Li, Guangling Sun, Pu Yan. Lightweight Swin Transformer combined with multi-scale feature fusion for face expression recognition[J]. Opto-Electronic Engineering, 2025, 52(1): 240234

Fig. 1. Swin Transformer network structure diagram

Fig. 2. Swin Transformer block module structure diagram

Fig. 3. Self-attention computing area. (a) MSA; (b) W-MSA; (c) SW-MSA

Fig. 4. Improved model structure diagram

Fig. 5. SPST module structure diagram

Fig. 6. Visualization of the BN, LN, and BCN normalization techniques

Fig. 7. EMA module structure diagram

Fig. 8. Activation maps of the model before and after adding EMA module

Fig. 9. Sample images from the datasets

Fig. 10. Confusion matrix validation results on JAFFE. (a) Original Swin Transformer model; (b) Improved Swin Transformer model

Fig. 11. Confusion matrix validation results on RAF-DB. (a) Original Swin Transformer model; (b) Improved Swin Transformer model

Fig. 12. Confusion matrix validation results on FERPLUS. (a) Original Swin Transformer model; (b) Improved Swin Transformer model

Fig. 13. Confusion matrix validation results on FANE. (a) Original Swin Transformer model; (b) Improved Swin Transformer model

Table 1. Comparison of model parameters before and after improvement

Table 2. Experimental comparison of replacing SPST modules in different stages

Table 3. Entropy comparison of activation maps

Table 4. Configuration of the experimental environment

Table 5. Accuracy when embedding the EMA module after different stages

Table 6. Results of ablation experiments on FERPLUS, RAF-DB, and FANE

Table 7. Accuracy comparison of different networks on JAFFE, FERPLUS, and RAF-DB