Multimodal cascaded document layout analysis network based on Transformer
Shaojie WEN, Ruigang WU, Chaowen FENG, Yingli LIU
Tab. 2 Overall performance of the proposed model and existing models on the PubLayNet dataset

Model           Backbone network   mAP/%
PubLayNet[24]   Mask R-CNN         91.0
DiT[13]         Mask R-CNN         91.6
DiT[13]         Cascade R-CNN      92.5
UDoc[25]        Faster R-CNN       91.7
BEiT[23]        Mask R-CNN         92.6
MCOD-Net        Cascade R-CNN      95.1