Text Image Generation Method with Scene Description

Youwen Huang; Bin Zhou; Xin Tang

doi:10.3788/LOP202158.0410012

Laser & Optoelectronics Progress
Vol. 58, Issue 4, 0410012 (2021)

Youwen Huang, Bin Zhou^*, and Xin Tang

Author Affiliations

School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China

Youwen Huang, Bin Zhou, Xin Tang. Text Image Generation Method with Scene Description[J]. Laser & Optoelectronics Progress, 2021, 58(4): 0410012

Fig. 1. Generation network model

Fig. 2. Mask generation network

Fig. 3. Discrimination network model

Fig. 4. Layout discriminator

Fig. 5. Comparison results of same description

Fig. 6. Comparison results after adding objects

Fig. 7. Comparison results of predicted mask

Threshold t                     0.3    0.4    0.5    0.6    0.7
Number of objects               156    151    146    127    103
Number of relationship types    38     38     37     30     24

Table 1. Preprocessing results under different threshold values

Model                           IS            FID
Real image (64×64)              13.90±0.50    0
Proposed model (no D_layout)    6.72±0.24     57.48
Proposed model (no G_mask)      6.69±0.14     61.34
Proposed model (full model)     7.11±0.14     42.20
Sg2im [11]                      6.30±0.20     73.39
StackGAN [8]                    6.35±0.16     108.68
AttnGAN [10]                    6.38±0.22     96.40
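
The page does not restate how the two metrics in the table above are computed. Assuming IS and FID carry their usual meanings (Inception Score and Fréchet Inception Distance, both derived from a pretrained Inception network), they can be sketched as follows. The symbols p_g, p(y|x), p(y), μ, and Σ are notation introduced here for illustration and are not taken from the paper.

```latex
% Inception Score: x is a generated image drawn from p_g,
% p(y|x) is the classifier's label distribution for x,
% and p(y) is that distribution averaged over generated images.
\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}
    \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\|\, p(y) \right) \right] \right)

% Frechet Inception Distance: (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) are the
% mean and covariance of Inception features for real and generated images.
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
    + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
    - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Under these definitions a higher IS and a lower FID indicate better results, and FID is 0 when the two sets of feature statistics coincide, which is consistent with the real-image row.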