• Laser & Optoelectronics Progress
  • Vol. 58, Issue 12, 1210030 (2021)
Lie Guo1,*, Tuanshan Zhang1, Weizhen Sun2, and Jielong Guo2
Author Affiliations
  • 1Xi'an Key Laboratory of Modern Intelligent Textile Equipment, College of Mechanical and Electrical Engineering, Xi'an Polytechnic University, Xi'an, Shaanxi 710600, China
  • 2Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, Quanzhou, Fujian 362216, China
    DOI: 10.3788/LOP202158.1210030
    Lie Guo, Tuanshan Zhang, Weizhen Sun, Jielong Guo. Image Semantic Description Algorithm with Integrated Spatial Attention Mechanism[J]. Laser & Optoelectronics Progress, 2021, 58(12): 1210030
    Fig. 1. Encoder-decoder model with integrated spatial attention mechanism
    Fig. 2. Diagram of spatial attention module
    Fig. 3. Diagram of encoder-decoder network with integrated spatial attention mechanism
    Fig. 4. Experimental data loss curves of spatial attention mechanism in VGG network. (a) VGG(MSCOCO); (b) VGG(Flickr30k)
    Fig. 5. Experimental data loss curves of spatial attention mechanism in ResNet network. (a) ResNet-50(MSCOCO); (b) ResNet-50(Flickr30k)
    Fig. 6. Comparison of visualization results. (a) Test set; (b) SAT model visualization results; (c) proposed model visualization results
    Fig. 7. Comparison of visualization results. (a) Test set; (b) SAT model visualization results; (c) proposed model visualization results
    Fig. 8. Comparison of visualization results. (a) Test set; (b) SAT model visualization results; (c) proposed model visualization results
    Fig. 9. Comparison of visualization results. (a) Test set; (b) SAT model visualization results; (c) proposed model visualization results
    Component    Parameter
    Memory       16.0 GB
    CPU          Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz
    GPU          NVIDIA GeForce GTX 1080 Ti
    Table 1. Server configuration used for the experiments
    Dataset name    Train    Valid    Test
    Flickr30k       29783    1000     1000
    MSCOCO          82783    40504    40775
    Table 2. Training, validation, and test splits of the experimental datasets
    Model (VGG)    MSCOCO                              Flickr30k
                   BLEU-1  BLEU-2  BLEU-3  BLEU-4      BLEU-1  BLEU-2  BLEU-3  BLEU-4
    Deep VS        62.5    45.0    32.1    23.0        57.3    36.9    24.0    16.0
    Log bilinear   70.8    48.9    34.4    24.3        60.0    38.0    25.4    17.1
    SAT            70.7    49.2    34.4    24.3        61.0    40.5    27.3    18.2
    Proposed       71.9    51.9    37.2    26.2        62.2    41.5    28.2    19.0
    Table 3. Experimental comparison of spatial attention mechanism in VGG network
    Model (ResNet-50)  MSCOCO                              Flickr30k
                       BLEU-1  BLEU-2  BLEU-3  BLEU-4      BLEU-1  BLEU-2  BLEU-3  BLEU-4
    Deep VS            62.5    45.0    32.1    23.0        57.3    36.9    24.0    16.0
    Google NIC         66.6    46.1    32.9    24.6        66.3    42.3    27.7    18.3
    m-RNN              67.0    49.0    35.0    25.0        60.0    41.0    28.0    19.0
    SAT                72.7    52.8    37.9    26.7        63.4    42.6    29.2    19.7
    Proposed           73.0    53.5    39.0    27.9        64.6    43.9    30.3    20.6
    Table 4. Experimental comparison of spatial attention mechanism in ResNet network