Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (6): 1186-1194    DOI: 10.3785/j.issn.1008-973X.2023.06.014
Mechanical Engineering
Continual learning framework of named entity recognition in aviation assembly domain
Pei-feng LIU, Lu QIAN, Xing-wei ZHAO*, Bo TAO
State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Abstract:

To build an aviation assembly knowledge graph that comprises assembly process information, assembly technology knowledge, industry standards, and the intrinsic connections among the three, a named entity recognition framework based on continual learning was proposed. The framework maintains high recognition performance throughout the progressive learning process from zero corpus to large-scale corpus, without relying on manually designed features. Comparative experiments were carried out in practical industrial scenarios of aircraft final assembly and component joining, with the installation of control pull rods and steel cables taken as the experimental case. The results show that, across corpora of different scales, the proposed framework significantly outperforms previous algorithms in precision, recall, and F1 value, and that it can consistently provide credible results for named entity recognition tasks in the aviation assembly domain.

Key words: intelligent manufacturing    aviation assembly    named entity recognition    continual learning    deep learning
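Received: 2022-06-14    Published: 2023-06-30
CLC:  TU 111
Funding: National Natural Science Foundation of China (52275020, 62293514)
Corresponding author: Xing-wei ZHAO     E-mail: stevenpliu@hust.edu.cn;zhaoxingwei@hust.edu.cn
About the author: Pei-feng LIU (born 1988), male, senior engineer, Ph.D. candidate, engaged in knowledge graph research. orcid.org/0000-0001-5589-1662. E-mail: stevenpliu@hust.edu.cn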

Cite this article:

Pei-feng LIU, Lu QIAN, Xing-wei ZHAO, Bo TAO. Continual learning framework of named entity recognition in aviation assembly domain[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(6): 1186-1194.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.06.014
https://www.zjujournals.com/eng/CN/Y2023/V57/I6/1186

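| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| CRF (2006) [16] | 91.22 | 81.71 | 86.20 |
| CRF (2013) [17] | 91.86 | 88.75 | 90.28 |
| BiLSTM+CRF [18] | 92.97 | 90.80 | 91.87 |
| Lattice LSTM [18] | 93.57 | 92.79 | 93.18 |
| BERT-BiGRU-CRF [19] | 95.31 | 95.54 | 95.43 |

Table 1  Experimental results of named entity recognition models on the MSRA corpus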
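
| Aviation assembly ontology | Existing knowledge base ontology |
| --- | --- |
| Component entity (component) | Product entity (product) |
| Fixed facility entity (facility) | Fixed platform entity (plant) |
| Operation item entity (operation) | Process entity (process) |
| Process step entity (step) | Process plan entity (process plan) |
| Tool entity (tool) | Tool entity (tool) |

Table 2  Correspondence between the aviation assembly ontology and the existing knowledge base ontology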
Fig. 1  Example of component entities in the aviation assembly domain
Fig. 2  Interface of the human-computer interaction tool
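
| Entity category | n_ce (AA-1) | n_ce (AA-2) | n_ce (total) |
| --- | --- | --- | --- |
| Component entity (component) | 336 | 213 | 549 |
| Fixed facility entity (facility) | 92 | 3 | 95 |
| Operation item entity (operation) | 263 | 91 | 354 |
| Process step entity (step) | 122 | 6 | 128 |
| Tool entity (tool) | 139 | 18 | 157 |

Table 3  Entity counts in the corpora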
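Fig. 3  Continual learning framework for named entity recognition
Fig. 4  Basic structure of a long short-term memory unit
Fig. 5  Basic structure of the character-based bidirectional long short-term memory model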
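
| Corpus | n (training set) | n (validation set) | n (test set) | Annotation scheme |
| --- | --- | --- | --- | --- |
| MSRA | 1 921 489 | 246 370 | 229 910 | BIO |
| AA-1 | 9 364 | 621 | 665 | BMEO |
| AA-2 | 13 068 | 709 | 653 | BMEO |

Table 4  Corpus statistics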
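
Table 4 lists two annotation schemes, BIO for MSRA and BMEO for the two aviation assembly corpora. The following is a minimal sketch of how the two schemes would label the same character sequence, assuming the usual meanings of the tags (B = beginning, I or M = inside/middle, E = end, O = outside). The sample sentence 安装操纵拉杆 ("install the control pull rod"), the entity span, the function names, and the choice to tag a single-character entity with B only are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

# (start index, end index exclusive, entity type); a hypothetical span format
Span = Tuple[int, int, str]


def tag_bio(text: str, spans: List[Span]) -> List[str]:
    """Label each character with BIO tags."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def tag_bmeo(text: str, spans: List[Span]) -> List[str]:
    """Label each character with BMEO tags.

    Assumption: a single-character entity receives only a B tag.
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end - 1):
            tags[i] = f"M-{label}"
        if end - start > 1:
            tags[end - 1] = f"E-{label}"
    return tags


# Hypothetical example: characters 2..5 form one component entity.
text = "安装操纵拉杆"
spans = [(2, 6, "component")]
bio_tags = tag_bio(text, spans)
bmeo_tags = tag_bmeo(text, spans)

for ch, bio, bmeo in zip(text, bio_tags, bmeo_tags):
    print(ch, bio, bmeo)
```

The only difference between the two outputs is that BMEO marks the last character of an entity explicitly, which gives a sequence model a direct signal for the right-hand boundary.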
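
| n | HMM | CRF | BiLSTM | BiLSTM+CRF | This study |
| --- | --- | --- | --- | --- | --- |
| 5 468 | 89.20 | 87.71 | 87.25 | 87.24 | 89.20 |
| 10 437 | 89.49 | 88.20 | 87.33 | 87.31 | 89.49 |
| 20 201 | 90.44 | 89.05 | 87.49 | 88.80 | 90.44 |
| 40 061 | 91.44 | 90.63 | 90.66 | 90.59 | 91.66 |
| 80 095 | 92.74 | 92.79 | 91.28 | 92.50 | 93.05 |
| 1 921 489 | 94.82 | 98.05 | 97.51 | 98.02 | 98.21 |

Table 5  F1 values (%) of different models on the MSRA corpus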
Fig. 6  F1 value versus number of training-set characters for different models on the MSRA corpus
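
| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| HMM | 94.76 | 94.90 | 94.82 |
| CRF | 98.03 | 98.07 | 98.05 |
| BiLSTM | 97.49 | 97.54 | 97.51 |
| BiLSTM+CRF | 98.01 | 98.04 | 98.02 |
| This study | 98.19 | 98.23 | 98.21 |

Table 6  Experimental results of different models on the MSRA corpus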
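
| Entity category | n_ce,t (AA-1) | n_ce,t (AA-2) |
| --- | --- | --- |
| Component entity (component) | 27 | 42 |
| Fixed facility entity (facility) | 6 | - |
| Operation item entity (operation) | 19 | 29 |
| Process step entity (step) | 10 | - |
| Tool entity (tool) | 2 | 4 |

Table 7  Entity counts in the test sets of the two aviation assembly corpora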
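
| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| HMM | 89.94 | 86.62 | 88.24 |
| CRF | 86.33 | 85.85 | 86.09 |
| BiLSTM | 76.35 | 78.46 | 77.39 |
| BiLSTM+CRF | 74.44 | 78.15 | 76.25 |
| This study | 89.94 | 86.62 | 88.24 |

Table 8  Experimental results of different models on the AA-1 corpus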
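
| Entity category | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| Component entity (component) | 75.86 | 81.48 | 78.57 |
| Fixed facility entity (facility) | 100 | 33.33 | 50.00 |
| Operation item entity (operation) | 69.23 | 94.74 | 80.00 |
| Process step entity (step) | 75.00 | 90.00 | 81.82 |
| Tool entity (tool) | 12.50 | 50.00 | 20.00 |

Table 9  Results of the continual learning framework on the five entity categories of the AA-1 corpus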
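
The F1 column in Table 9 is consistent with the standard definition of F1 as the harmonic mean of precision and recall, F1 = 2PR / (P + R). A small sketch that recomputes it from the P and R columns is given below; the function name `f1_score` and the dictionary layout are illustrative choices, and the numbers are copied from Table 9.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P/%, R/%, reported F1/%) per entity category, copied from Table 9
table9 = {
    "component": (75.86, 81.48, 78.57),
    "facility": (100.0, 33.33, 50.00),
    "operation": (69.23, 94.74, 80.00),
    "step": (75.00, 90.00, 81.82),
    "tool": (12.50, 50.00, 20.00),
}

for name, (p, r, reported) in table9.items():
    computed = f1_score(p, r)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

The tool category illustrates why F1 is reported alongside P and R: with only two tool entities in the AA-1 test set (Table 7), a recall of 50% paired with a precision of 12.5% pulls the harmonic mean down to 20%.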
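
| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| HMM | 85.12 | 86.39 | 85.75 |
| CRF | 84.03 | 85.60 | 84.81 |
| BiLSTM | 79.98 | 82.75 | 81.34 |
| BiLSTM+CRF | 79.61 | 81.80 | 80.69 |
| This study | 85.12 | 86.39 | 85.75 |

Table 10  Experimental results of different models on the AA-2 corpus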
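
| Entity category | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| Component entity (component) | 83.33 | 71.43 | 76.77 |
| Operation item entity (operation) | 79.31 | 79.31 | 79.31 |
| Tool entity (tool) | 66.67 | 50.00 | 57.14 |

Table 11  Results of the continual learning framework on the three entity categories of the AA-2 corpus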

References:

[1] CHEN Yong-pei, DU Zhen-hong, LIU Ren-yi, et al. A hybrid geo-semantic similarity measurement model introducing geographic entities[J]. Journal of Zhejiang University: Science Edition, 2018, 45(2): 196-204.
[2] CHEN Shan-xiong, WANG Xiao-long, HAN Xu, et al. A recognition method of Ancient Yi character based on deep learning[J]. Journal of Zhejiang University: Science Edition, 2019, 46(3): 261-269.
[3] ZHANG Dong-hao, LIU Zhen-yu, JIA Wei-qiang, et al. A review on knowledge graph and its application prospects to intelligent manufacturing[J]. Journal of Mechanical Engineering, 2021, 57(5): 90-113. doi: 10.3901/JME.2021.05.090
[4] QIU Ling, ZHANG An-si, LI Shao-bo, et al. Survey on building knowledge graphs for aerospace manufacturing[J]. Application Research of Computers, 2022, 39(4): 968-977. doi: 10.19734/j.issn.1001-3695.2021.09.0367
[5] XU Zeng-lin, SHENG Yong-pan, HE Li-rong, et al. Review on knowledge graph techniques[J]. Journal of University of Electronic Science and Technology of China, 2016, 45(4): 589-606.
[6] YANG He-yu. Semi-supervised named entity recognition based on deep learning[D]. Shenyang: Shenyang University of Technology, 2019.
[7] LI J, SUN A, HAN J, et al. A survey on deep learning for named entity recognition[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34: 50-70. doi: 10.1109/TKDE.2020.2981314
[8] RING M B. Child: a first step towards continual learning[M]// THRUN S, PRATT L. Learning to learn. New York: Springer, 1998: 261-292.
[9] LEVOW G. The third international Chinese language processing bakeoff: word segmentation and named entity recognition[C]// Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing. Sydney: Association for Computational Linguistics, 2006: 108-117.
[10] OCKER F, PAREDIS C J J, VOGEL-HEUSER B. Applying knowledge bases to make factories smarter[J]. Automatisierungstechnik, 2019, 67(6): 504-517. doi: 10.1515/auto-2018-0138
[11] XIAO Yong, ZHENG Kai-hong, WANG Xin, et al. Chinese named entity recognition in electric power metering domain based on neural joint learning[J]. Journal of Zhejiang University: Science Edition, 2021, 48(3): 321-330.
[12] CAMASTRA F, VINCIARELLI A. Markovian models for sequential data[M]// CAMASTRA F, VINCIARELLI A. Machine learning for audio, image and video analysis. London: Springer, 2008: 265-303.
[13] SUTTON C, MCCALLUM A. An introduction to conditional random fields[EB/OL]. (2010-11-17). https://arxiv.org/pdf/1011.4088.pdf.
[14] HAMMERTON J. Named entity recognition with long short-term memory[C]// Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003. Stroudsburg: Association for Computational Linguistics, 2003: 172-175.
[15] LAMPLE G, BALLESTEROS M, SUBRAMANIAN S, et al. Neural architectures for named entity recognition[C]// Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego: Association for Computational Linguistics, 2016: 260-270.
[16] CHEN A, PENG F, SHAN R, et al. Chinese named entity recognition with conditional probabilistic models[C]// Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing. Sydney: Association for Computational Linguistics, 2006: 173-176.
[17] ZHOU J, QU W, FEN Z. Chinese named entity recognition via joint identification and categorization[J]. Chinese Journal of Electronics, 2013, 22(2): 225-230.
[18] ZHANG Y, WANG Y, YANG J. Lattice LSTM for Chinese sentence representation[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020, 28: 1506-1519. doi: 10.1109/TASLP.2020.2991544
[19] YANG Piao, DONG Wen-yong. Chinese named entity recognition method based on BERT embedding[J]. Computer Engineering, 2020, 46(4): 40-45. doi: 10.19678/j.issn.1000-3428.0054272
[20] Editorial Board of the Aeronautical Manufacturing Engineering Handbook. Aeronautical manufacturing engineering handbook: aircraft assembly[M]. Beijing: Aviation Industry Press, 2010: 589-625.
[21] NAKAYAMA H, KUBO T, KAMURA J, et al. Doccano: text annotation tool for human[CP/DK]. (2022-05-19). https://github.com/doccano/doccano.
[22] PENG N, DREDZE M. Named entity recognition for Chinese social media with jointly trained embeddings[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon: Association for Computational Linguistics, 2015: 548-554.
[23] PENG Chun-yan, ZHANG Hui, BAO Ling-yu, et al. Biological named entity recognition based on conditional random fields[J]. Computer Engineering, 2009, 35(22): 197-199. doi: 10.3969/j.issn.1000-3428.2009.22.067