Image caption based on relational reasoning and context gate mechanism
|
Qiao-hong CHEN, Hao-lei PEI, Qi SUN
|
|
Tab. 3 Comparison of experimental results on the Microsoft COCO caption dataset |
|
| Model | BLEU-1 | BLEU-4 | METEOR | ROUGE | CIDEr | SPICE |
| --- | --- | --- | --- | --- | --- | --- |
| Att2in (XE) | - | 31.3 | 26.0 | 54.3 | 101.3 | - |
| GL-Att (XE) | 74.0 | 35.2 | 27.5 | 52.4 | 98.7 | - |
| LRCA (XE) | 75.9 | 35.8 | 27.8 | 56.4 | 111.3 | - |
| Adaptive (XE) | 74.2 | 33.2 | 26.6 | - | 108.5 | 19.5 |
| NBT (XE) | 75.5 | 34.7 | 27.1 | - | 107.2 | 20.1 |
| Updown (XE) | 77.2 | 36.2 | 27.0 | 56.4 | 113.5 | 20.3 |
| POS-SCAN (XE) | 76.6 | 36.5 | 27.9 | - | 114.9 | 20.8 |
| RFNet (XE) | 77.5 | 36.8 | 27.2 | 56.8 | 115.3 | 20.5 |
| This work (XE) | 77.4 | 36.9 | 28.1 | 57.3 | 118.7 | 21.1 |
| Att2in (CIDEr) | - | 33.3 | 26.3 | 55.3 | 111.4 | - |
| Updown (CIDEr) | 79.8 | 36.3 | 27.7 | 56.9 | 120.1 | 21.4 |
| POS-SCAN (CIDEr) | 80.1 | 37.8 | 28.3 | - | 125.9 | 22.0 |
| RFNet (CIDEr) | 79.1 | 36.5 | 27.7 | 57.3 | 121.9 | 21.2 |
| JCRR (CIDEr) | - | 37.7 | 28.2 | - | 120.1 | 21.6 |
| This work (CIDEr) | 80.1 | 38.1 | 29.0 | 58.7 | 127.1 | 22.1 |