• Optoelectronics Letters
  • Vol. 17, Issue 6, 361 (2021)
Dexin ZHAO, Ruixue YANG*, and Shutao GUO
Author Affiliations
  • Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300384, China
    DOI: 10.1007/s11801-021-0100-z
    ZHAO Dexin, YANG Ruixue, GUO Shutao. A lightweight convolutional neural network for large-scale Chinese image caption[J]. Optoelectronics Letters, 2021, 17(6): 361
    References

    [1] Li X, Uricchio T, Ballan L, Bertini M, Snoek C.G and Bimbo A.D, ACM Computing Surveys 49, 1 (2016).

    [2] Vinyals O, Toshev A, Bengio S and Erhan D, Show and Tell: A Neural Image Caption Generator, IEEE Conference on Computer Vision and Pattern Recognition, 3156 (2015).

    [3] Jia X, Gavves E, Fernando B and Tuytelaars T, Guiding the Long-Short Term Memory Model for Image Caption Generation, IEEE International Conference on Computer Vision IEEE Computer Society, 2407 (2015).

    [4] Lu J, Yang J, Batra D and Parikh D, Neural Baby Talk, Conference on Computer Vision and Pattern Recognition, 7219 (2018).

    [5] Rennie S J, Marcheret E, Mroueh Y, Ross J and Goel V, Self-Critical Sequence Training for Image Captioning, IEEE Conference on Computer Vision and Pattern Recognition, 7008 (2017).

    [6] Yang J, Sun Y, Liang J, Ren B and Lai S, Neurocomputing 328, 56 (2019).

    [7] Szegedy C, Vanhoucke V, Ioffe S, Shlens J and Wojna Z, Rethinking the Inception Architecture for Computer Vision, IEEE Conference on Computer Vision and Pattern Recognition, 2818 (2016).

    [8] Liu Z, Ma L, Wu J and Sun L, Journal of Chinese Information Processing 31, 162 (2017). (in Chinese)

    [9] Lan W, Wang X, Yang G and Li X, Chinese Journal of Computers 42, 136 (2019). (in Chinese)

    [10] Zhao D, Chang Z and Guo S, Neurocomputing 329, 476 (2019).

    [11] Srivastava R, Greff K and Schmidhuber J, Training Very Deep Networks, Advances in Neural Information Processing Systems, 2368 (2015).

    [12] Kulkarni G, Premraj V, Dhar S, Li S, Choi Y, Berg A and Berg T, Baby Talk: Understanding and Generating Simple Image Descriptions, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2891 (2014).

    [13] Wu J, Zheng H, Zhao B, Li Y, Yan B, Liang R, Wang W, Zhou S, Lin G, Fu Y, Wang Y and Wang Y, Large-Scale Datasets for Going Deeper in Image Understanding, IEEE International Conference on Multimedia and Expo (ICME), 1480 (2019).

    [14] He K, Zhang X, Ren S and Sun J, Deep Residual Learning for Image Recognition, IEEE Conference on Computer Vision and Pattern Recognition, 770 (2016).

    [15] Szegedy C, Ioffe S, Vanhoucke V and Alemi A A, Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, AAAI Conference on Artificial Intelligence, 4278 (2017).

    [16] Papineni K, Roukos S, Ward T and Zhu W, Bleu: A Method for Automatic Evaluation of Machine Translation, 40th Annual Meeting of the Association for Computational Linguistics, 311 (2002).

    [17] Banerjee S and Lavie A, METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, Meeting of the Association for Computational Linguistics, 65 (2005).

    [18] Lin C, ROUGE: A Package for Automatic Evaluation of Summaries, Meeting of the Association for Computational Linguistics, 74 (2004).

    [19] Vedantam R, Zitnick C L and Parikh D, CIDEr: Consensus-Based Image Description Evaluation, Computer Vision and Pattern Recognition, 4566 (2015).
