Text-to-image generation method based on multimodal semantic information
Bing YANG, Jiahui ZHOU, Jinliang YAO, Xueqin XIANG
Tab. 2 Comparison of image generation speed for different models on the COCO dataset

Model         Type             t_g/s   n_p/10^9   ZS-FID↓
Make-a-scene  Autoregressive    9.40   8.0        11.84
LDM           Diffusion        15.00   1.5        12.63
UFOGen        Diffusion+GAN     0.09   0.9        12.78
This study    GAN               0.04   0.3        12.48