• Laser & Optoelectronics Progress
  • Vol. 58, Issue 4, 0410012 (2021)
Youwen Huang, Bin Zhou*, and Xin Tang
Author Affiliations
  • School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
    DOI: 10.3788/LOP202158.0410012
    Youwen Huang, Bin Zhou, Xin Tang. Text Image Generation Method with Scene Description[J]. Laser & Optoelectronics Progress, 2021, 58(4): 0410012

    Abstract

    This paper studies the generation of images from scene description text and proposes a generative adversarial network model that incorporates the scene description to address overlapping and missing objects in the generated images. First, a mask generation network preprocesses the dataset to provide each object with a segmentation mask vector. These vectors serve as constraints for training a layout prediction network on the text descriptions, which yields the specific location and size of each object in the scene layout. The predicted layout is then sent to a cascaded refinement network to generate the image. Finally, the scene layout and the images are fed to a layout discriminator that narrows the gap between them, producing a more realistic scene layout. Experimental results show that the proposed model generates more natural images that better match the text description, improving the authenticity and diversity of the generated images.
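
    To make the notion of a scene layout concrete, the following is a minimal sketch of how per-object boxes and low-resolution masks might be composed into a layout, and how overlapping pixels could be counted. It is not the authors' implementation: the function names `paste_mask` and `compose_layout`, the normalized `(x0, y0, x1, y1)` box convention, the nearest-neighbour resizing, and the toy objects are all assumptions made for illustration.

```python
def paste_mask(mask, box, height, width):
    """Resize a binary mask into `box` on an empty height x width canvas.

    `box` is (x0, y0, x1, y1) in normalized [0, 1] coordinates
    (an assumed convention, not taken from the paper).
    """
    x0, y0, x1, y1 = box
    left, top = int(round(x0 * width)), int(round(y0 * height))
    right, bottom = int(round(x1 * width)), int(round(y1 * height))
    canvas = [[0] * width for _ in range(height)]
    box_h, box_w = bottom - top, right - left
    if box_h <= 0 or box_w <= 0:
        return canvas  # degenerate box: nothing to draw
    m_h, m_w = len(mask), len(mask[0])
    for r in range(top, min(bottom, height)):
        for c in range(left, min(right, width)):
            # Nearest-neighbour lookup into the low-resolution mask.
            mr = min((r - top) * m_h // box_h, m_h - 1)
            mc = min((c - left) * m_w // box_w, m_w - 1)
            canvas[r][c] = mask[mr][mc]
    return canvas


def compose_layout(objects, height, width):
    """Return one canvas per object plus a per-pixel object count."""
    channels = [paste_mask(o["mask"], o["box"], height, width) for o in objects]
    coverage = [[sum(ch[r][c] for ch in channels) for c in range(width)]
                for r in range(height)]
    return channels, coverage


# Hypothetical toy scene: two objects with 2x2 masks on an 8x8 layout.
objects = [
    {"name": "sky", "box": (0.0, 0.0, 1.0, 0.5), "mask": [[1, 1], [1, 1]]},
    {"name": "tree", "box": (0.25, 0.25, 0.75, 1.0), "mask": [[0, 1], [1, 1]]},
]
channels, coverage = compose_layout(objects, 8, 8)
# Pixels claimed by more than one object indicate overlap in the layout.
overlap = sum(v > 1 for row in coverage for v in row)
```

    In a sketch like this, a large `overlap` count flags layouts in which objects are stacked on top of each other, while an all-zero channel would flag a missing object. These are the two failure modes the abstract says the model targets.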