Text-to-image generation method based on multimodal semantic information
Bing YANG, Jiahui ZHOU, Jinliang YAO, Xueqin XIANG
Tab.1 Comparison of evaluation metrics for different models on two datasets
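| Model | CUB FID↓ | CUB IS↑ | CUB S_CLIP | COCO FID↓ | COCO IS↑ | COCO S_CLIP |
|---|---|---|---|---|---|---|
| VQ-Diffusion[23] | 10.32 | - | 0.3224 | 13.86 | - | 0.3382 |
| DFGAN | 14.81 | 5.10 | 0.2920 | 19.32 | 35.16 | 0.2972 |
| RATGAN | 13.91 | 5.36 | - | 14.60 | 36.42 | - |
| DMF-GAN[24] | 13.21 | 5.42 | - | 15.83 | 36.72 | - |
| SAW-GAN[25] | 10.45 | 4.63 | - | 11.17 | 35.17 | - |
| GALIP | 10.08 | 5.92 | 0.3164 | 5.85 | 37.11 | 0.3338 |
| This study | 9.56 | 6.04 | 0.3252 | 5.62 | 37.36 | 0.3405 |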