Lightweight Swin Transformer combined with multi-scale feature fusion for face expression recognition

Yanqiu Li; Shengzhao Li; Guangling Sun; Pu Yan

doi:10.12086/oee.2025.240234

[1] C Z Tian, M Song, J W Tian et al. Combination weighting and FCE based evaluation for human-computer interaction effectiveness of command and control system. Electron Opt Control, 31, 87-96(2024).

[2] Y H Li, X Q Lü, Y Gu et al. Face detection algorithm based on improved S3FD network. Laser Technol, 45, 722-728(2021).

[3] R Sun, X Q Shan, Q J Sun et al. NIR-VIS face image translation method with dual contrastive learning framework. Opto-Electron Eng, 49, 210317(2022).

[4] K Wang, X J Peng, J F Yang et al. Suppressing uncertainties for large-scale facial expression recognition, 6896-6905(2020).

[5] W X Zhang, Y H Luo, Y Q Liu et al. Image super-resolution reconstruction based on active displacement imaging. Opto-Electron Eng, 51, 230290(2024).

[6] C Liu, L C Cao, Y Jin et al. Transformer for age-invariant face recognition. Laser Optoelectron Prog, 60, 1010019(2023).

[7] Y Yaddaden, M Adda, A Bouzouane. Facial expression recognition using locally linear embedding with LBP and HOG descriptors, 221-226(2021). https://doi.org/10.1109/IHSH51661.2021.9378702

[8] K Wang, X J Peng, J F Yang et al. Region attention networks for pose and occlusion robust facial expression recognition. IEEE Trans Image Process, 29, 4057-4069(2020).

[9] A T Wasi, K Šerbetar, R Islam et al. ARBEx: attentive feature extraction with reliability balancing for robust facial expression learning(2024). https://doi.org/10.48550/arxiv.2305.01486

[10] Y Z Liu, Z M Xu, C Y Lang et al. Fine-grained facial expression recognition algorithm based on relationship-awareness and label disambiguation. Acta Electron Sin, 52, 3336-3346(2024).

[11] Y Chen, L C Wu, C Wang. A micro-expression recognition method based on multi-level information fusion network. Acta Autom Sin, 50, 1445-1457(2024).

[12] C C Zhang, S Wang, W Y Wang et al. Adversarial background attacks in a limited area for CNN based face recognition. Opto-Electron Eng, 50, 220266(2023).

[13] X G Wei. Research on facial expression recognition method based on convolutional neural network(2023). https://doi.org/10.27272/d.cnki.gshdu.2023.006762

[14] A Vaswani, N Shazeer, N Parmar et al. Attention is all you need, 6000-6010(2017).

[15] M Chen, A Radford, R Child et al. Generative pretraining from pixels, 1691-1703(2020).

[16] J Devlin, M W Chang, K Lee et al. BERT: pre-training of deep bidirectional transformers for language understanding, 4171-4186(2019). https://doi.org/10.18653/v1/N19-1423

[17] C Liu, K Hirota, Y P Dai. Patch attention convolutional vision transformer for facial expression recognition with occlusion. Inf Sci, 619, 781-794(2023).

[18] X C Chen, X W Zheng, K Sun et al. Self-supervised vision transformer-based few-shot learning for facial expression recognition. Inf Sci, 634, 206-226(2023).

[19] C Zheng, M Mendieta, C Chen. POSTER: a pyramid cross-fusion transformer network for facial expression recognition, 3138-3147(2023). https://doi.org/10.1109/ICCVW60793.2023.00339

[20] Z Liu, Y T Lin, Y Cao et al. Swin Transformer: hierarchical vision transformer using shifted windows, 9992-10002(2021). https://doi.org/10.1109/ICCV48922.2021.00986

[21] H Q Feng, W K Huang, D H Zhang et al. Fine-tuning Swin Transformer and multiple weights optimality-seeking for facial expression recognition. IEEE Access, 11, 9995-10003(2023).

[22] K Pinasthika, B S P Laksono, R B P Irsal et al. SparseSwin: Swin Transformer with sparse transformer block. Neurocomputing, 580, 127433(2024).

[23] D L Ouyang, S He, G Z Zhang et al. Efficient multi-scale attention module with cross-spatial learning, 1-5(2023). https://doi.org/10.1109/ICASSP49357.2023.10096516

[24] A Khaled, C Li, J Ning et al. BCN: batch channel normalization for image classification(2023). https://doi.org/10.48550/arxiv.2312.00596

[25] P N R Bodavarapu, P V V S Srinivas. Facial expression recognition for low resolution images using convolutional neural networks and denoising techniques. Indian J Sci Technol, 14, 971-983(2021).

[26] M Sandler, A Howard, M L Zhu et al. MobileNetV2: inverted residuals and linear bottlenecks, 4510-4520(2018). https://doi.org/10.1109/CVPR.2018.00474

[27] A Howard, M Sandler, B Chen et al. Searching for MobileNetV3, 1314-1324(2019). https://doi.org/10.1109/ICCV.2019.00140

[28] A P Fard, M H Mahoor. Ad-corre: adaptive correlation-based loss for facial expression recognition in the wild. IEEE Access, 10, 26756-26768(2022).

[29] Y C Zhu, L L Wei, C Y Lang et al. Fine-grained facial expression recognition via relational reasoning and hierarchical relation optimization. Pattern Recognit Lett, 164, 67-73(2022).

[30] H Y Li, N N Wang, X Yang et al. Towards semi-supervised deep facial expression recognition with an adaptive confidence margin, 4156-4165(2022). https://doi.org/10.1109/CVPR52688.2022.00413
