To build an aviation assembly knowledge graph covering assembly process information, assembly technology knowledge, relevant industry standards, and the connections among the three, a named entity recognition framework based on continual learning was proposed. The framework maintains high recognition performance throughout the progressive learning process, from zero corpus to a large-scale corpus, without relying on manually designed features. A comparative performance experiment was carried out in practical industrial scenarios. The experiment covered general assembly and component assembly, and the pull rod and cable installation operations were taken as the specific experimental case. Experimental results show that the proposed framework is significantly better than previous algorithms in precision, recall, and F1 value across corpora of different scales, and that it consistently provides credible named entity recognition results in the aviation assembly domain.
Pei-feng LIU,Lu QIAN,Xing-wei ZHAO,Bo TAO. Continual learning framework of named entity recognition in aviation assembly domain. Journal of ZheJiang University (Engineering Science), 2023, 57(6): 1186-1194.
Tab.1 Experimental results of named entity recognition models on MSRA corpus
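| Aviation assembly ontology | Existing knowledge-base ontology |
| --- | --- |
| Component entity (component) | Product entity (product) |
| Fixed facility entity (facility) | Fixed platform entity (plant) |
| Operation item entity (operation) | Process entity (process) |
| Process step entity (step) | Process plan entity (process plan) |
| Tool entity (tool) | Tool entity (tool) |

Tab.2 Comparison of aviation assembly ontology and existing knowledge-base ontology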
Fig.1 Example of component entities in aviation assembly domain
Fig.2 Human-computer interaction tool interface
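| Entity category | n_ce (AA-1) | n_ce (AA-2) | n_ce (total) |
| --- | --- | --- | --- |
| Component entity (component) | 336 | 213 | 549 |
| Fixed facility entity (facility) | 92 | 3 | 95 |
| Operation item entity (operation) | 263 | 91 | 354 |
| Process step entity (step) | 122 | 6 | 128 |
| Tool entity (tool) | 139 | 18 | 157 |

Tab.3 Statistics of corpus entity count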
Fig.3 Continual learning framework for named entity recognition
Fig.4 Basic structure of long short-term memory unit
Fig.5 Basic structure of character-based bidirectional long short-term memory with conditional random field model
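| Corpus | n (training set) | n (validation set) | n (test set) | Tagging scheme |
| --- | --- | --- | --- | --- |
| MSRA | 1 921 489 | 246 370 | 229 910 | BIO |
| AA-1 | 9 364 | 621 | 665 | BMEO |
| AA-2 | 13 068 | 709 | 653 | BMEO |

Tab.4 Corpus information statistics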
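Tab.4 lists BIO tagging for the MSRA corpus and BMEO tagging for the two aviation assembly corpora. The following is a minimal sketch of how character-level tags could be produced under each scheme, assuming the usual reading of the letters (B = begin, I or M = inside/middle, E = end, O = outside). The function name `tag_chars`, the example sentence, the entity spans, and the choice to tag a single-character entity as B are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type); spans are assumed not to overlap.
Span = Tuple[int, int, str]


def tag_chars(text: str, spans: List[Span], scheme: str = "BMEO") -> List[str]:
    """Return one tag per character of `text` under the BIO or BMEO scheme."""
    if scheme not in ("BIO", "BMEO"):
        raise ValueError(f"unknown scheme: {scheme}")
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        for i in range(start, end):
            if i == start:
                prefix = "B"  # first character (assumption: also used for 1-char entities)
            elif scheme == "BMEO" and i == end - 1:
                prefix = "E"  # last character, BMEO only
            elif scheme == "BMEO":
                prefix = "M"  # interior character, BMEO only
            else:
                prefix = "I"  # every non-initial character in BIO
            tags[i] = f"{prefix}-{etype}"
    return tags


# Hypothetical sentence: "install the control cable".
sentence = "安装操纵钢索"
spans = [(0, 2, "operation"), (2, 6, "component")]

bmeo = tag_chars(sentence, spans, "BMEO")
bio = tag_chars(sentence, spans, "BIO")
print(bmeo)
print(bio)
```

Under BMEO the entity boundary is marked explicitly by the E tag, whereas under BIO the end of an entity is only implied by the next B or O tag.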
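| n | HMM (F1/%) | CRF (F1/%) | BiLSTM (F1/%) | BiLSTM+CRF (F1/%) | This study (F1/%) |
| --- | --- | --- | --- | --- | --- |
| 5 468 | 89.20 | 87.71 | 87.25 | 87.24 | 89.20 |
| 10 437 | 89.49 | 88.20 | 87.33 | 87.31 | 89.49 |
| 20 201 | 90.44 | 89.05 | 87.49 | 88.80 | 90.44 |
| 40 061 | 91.44 | 90.63 | 90.66 | 90.59 | 91.66 |
| 80 095 | 92.74 | 92.79 | 91.28 | 92.50 | 93.05 |
| 1 921 489 | 94.82 | 98.05 | 97.51 | 98.02 | 98.21 |

Tab.5 F1-score of different models on MSRA corpus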
Fig.6 Variation curves of F1 values and training word counts on MSRA corpus
| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| HMM | 94.76 | 94.90 | 94.82 |
| CRF | 98.03 | 98.07 | 98.05 |
| BiLSTM | 97.49 | 97.54 | 97.51 |
| BiLSTM+CRF | 98.01 | 98.04 | 98.02 |
| This study | 98.19 | 98.23 | 98.21 |

Tab.6 Experimental results of different models on MSRA corpus
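| Entity category | n_ce,t (AA-1) | n_ce,t (AA-2) |
| --- | --- | --- |
| Component entity (component) | 27 | 42 |
| Fixed facility entity (facility) | 6 | - |
| Operation item entity (operation) | 19 | 29 |
| Process step entity (step) | 10 | - |
| Tool entity (tool) | 2 | 4 |

Tab.7 Number of entities in test set for two aviation assembly corpora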
| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| HMM | 89.94 | 86.62 | 88.24 |
| CRF | 86.33 | 85.85 | 86.09 |
| BiLSTM | 76.35 | 78.46 | 77.39 |
| BiLSTM+CRF | 74.44 | 78.15 | 76.25 |
| This study | 89.94 | 86.62 | 88.24 |

Tab.8 Experimental results of different models on AA-1 corpus
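| Entity category | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| Component entity (component) | 75.86 | 81.48 | 78.57 |
| Fixed facility entity (facility) | 100 | 33.33 | 50.00 |
| Operation item entity (operation) | 69.23 | 94.74 | 80.00 |
| Process step entity (step) | 75.00 | 90.00 | 81.82 |
| Tool entity (tool) | 12.50 | 50.00 | 20.00 |

Tab.9 Experimental results of continual learning framework on five entities from AA-1 corpus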
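The P, R, and F1 columns above can be related through a short calculation. The sketch below assumes exact-match, entity-level scoring (an entity counts as correct only if its boundaries and type both match), which the tables do not state explicitly. The function name `prf1` and the entity positions are hypothetical. The counts (6 gold facility entities, 2 predictions, both correct) are chosen only because they are consistent with the facility row of Tab.9 and the AA-1 facility count in Tab.7; the actual prediction counts are not reported.

```python
from typing import Set, Tuple

# (start, end_exclusive, entity_type)
Entity = Tuple[int, int, str]


def prf1(gold: Set[Entity], pred: Set[Entity]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 under exact matching."""
    tp = len(gold & pred)                    # correctly recognized entities
    p = tp / len(pred) if pred else 0.0      # precision = TP / predicted
    r = tp / len(gold) if gold else 0.0      # recall = TP / gold
    f1 = 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean of P and R
    return p, r, f1


# Hypothetical positions: 6 gold facility entities, 2 of them predicted.
gold = {(i * 10, i * 10 + 2, "facility") for i in range(6)}
pred = {(0, 2, "facility"), (10, 12, "facility")}

p, r, f1 = prf1(gold, pred)
print(f"P={p:.2%} R={r:.2%} F1={f1:.2%}")  # P=100.00% R=33.33% F1=50.00%
```

With so few test entities per category, a single additional correct or incorrect prediction shifts these percentages by a large amount, which is worth keeping in mind when reading the per-category rows.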
| Model | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| HMM | 85.12 | 86.39 | 85.75 |
| CRF | 84.03 | 85.60 | 84.81 |
| BiLSTM | 79.98 | 82.75 | 81.34 |
| BiLSTM+CRF | 79.61 | 81.80 | 80.69 |
| This study | 85.12 | 86.39 | 85.75 |

Tab.10 Experimental results of different models on AA-2 corpus
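| Entity category | P/% | R/% | F1/% |
| --- | --- | --- | --- |
| Component entity (component) | 83.33 | 71.43 | 76.77 |
| Operation item entity (operation) | 79.31 | 79.31 | 79.31 |
| Tool entity (tool) | 66.67 | 50.00 | 57.14 |

Tab.11 Experimental results of continual learning framework on three entities from AA-2 corpus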