Image caption generation based on multi-view cross-modal feature fusion

Naizhou ZHANG, Yunchao ZHAO, Wei CAO, Xiaojian ZHANG (张乃洲, 赵云超, 曹薇, 张啸剑)

Tab. 5 Comparison of computational cost, parameter count, and inference time between the MVCMFAF model and other models