Image captioning based on cross-modal cascaded diffusion model
Qiaohong CHEN, Menghao GUO, Xian FANG, Qi SUN
Tab. 1 Module ablation experiments of the model on two datasets
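| SAM / CDM | Microsoft COCO B@1 | Microsoft COCO B@4 | Microsoft COCO M | Microsoft COCO R | Microsoft COCO C | Flickr30k B@1 | Flickr30k B@4 | Flickr30k M | Flickr30k R | Flickr30k C |
|---|---|---|---|---|---|---|---|---|---|---|
| ×× | 77.6 | 34.5 | 27.5 | 56.5 | 115.2 | 68.5 | 27.5 | 22.2 | 50.1 | 59.1 |
| × | 78.9 | 34.8 | 27.5 | 56.8 | 116.1 | 69.7 | 28.3 | 22.5 | 50.9 | 62.0 |
| × | 80.0 | 38.2 | 28.3 | 57.8 | 128.7 | 72.2 | 29.8 | 23.4 | 51.8 | 63.5 |
|  | 81.2 | 39.9 | 29.0 | 58.9 | 133.8 | 74.5 | 31.2 | 23.9 | 53.2 | 65.4 |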