Journal of Zhejiang University (Science Edition)  2021, Vol. 48 Issue (3): 321-330    DOI: 10.3785/j.issn.1008-9497.2021.03.008
Mathematics and Computer Science
Chinese named entity recognition in electric power metering domain based on neural joint learning
XIAO Yong1, ZHENG Kaihong1, WANG Xin2, QIAN Bin1, SUN Lingyun2
1. Electric Power Research Institute, China Southern Power Grid, Guangzhou 510663, China
2. School of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
Abstract: As the electric power metering business continues to expand, there is an urgent need for an electric power metering knowledge graph, composed of business information, technical knowledge, industry standards and their internal connections, to provide more comprehensive and effective support for power grid decision-making and development. Named entity recognition (NER) is a fundamental task in building a knowledge graph. To meet the needs of the electric power metering domain, and taking the characteristics of Chinese word segmentation into account, this paper proposes a Chinese NER method for electric power metering based on joint learning. The method jointly trains a CNN-BLSTM-CRF model and a Chinese word segmentation model that integrates dictionary knowledge, so that the two models share entity types and confidence scores. It also changes the computation order of the two models from serial to parallel, which reduces the accumulation of recognition errors. Experimental results show that, without manually constructed features, the proposed method significantly outperforms previous methods in precision, recall and F-score.
Key words: power metering    named entity recognition    word segmentation    joint learning
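The abstract reports precision, recall and F-score, but this page does not say how they were computed. The following is a minimal sketch, assuming entity-level exact-match scoring over BIO tags (a common NER convention, not confirmed by the source). The function names and the tag names DEV and ORG are hypothetical and used only for illustration.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); a hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence, e.g. ["B-DEV", "I-DEV", "O"]."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                spans.add((start, i, etype))  # close the entity that just ended
            if tag.startswith("B-"):
                start, etype = i, tag[2:]     # open a new entity
            else:
                start, etype = None, None     # "O" or a stray "I-" tag
    return spans


def prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1; a span counts only on exact match."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Assumed toy example: the second entity is predicted with a wrong boundary.
gold = ["B-DEV", "I-DEV", "O", "B-ORG", "I-ORG"]
pred = ["B-DEV", "I-DEV", "O", "B-ORG", "O"]
print(prf(gold, pred))
```

In this toy case one of the two predicted spans matches a gold span exactly, so a boundary error costs both precision and recall. Under this scoring, boundary errors of that kind are penalized in full, which is one reason segmentation quality can matter for Chinese NER.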
Received: 2020-04-20    Published: 2021-05-20
CLC:  TP 391.1  
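Funding: Science and Technology Project of China Southern Power Grid (ZBKJXM20180157); National Natural Science Foundation of China (U1866602, 61772456, 61672451); Key Research and Development Program of Zhejiang Province (2019C03137); Science and Technology Innovation 2030 Major Project (2018AAA0100703).
Corresponding author: ORCID: https://orcid.org/0000-0002-4134-0613, E-mail: wangxin2009@zju.edu.cn
About the author: XIAO Yong (born 1979), ORCID: https://orcid.org/0000-0001-6109-4512, male, Ph.D., professor-level senior engineer, works mainly on intelligent technology for electric energy metering.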
Cite this article:

XIAO Yong, ZHENG Kaihong, WANG Xin, QIAN Bin, SUN Lingyun. Chinese named entity recognition in electric power metering domain based on neural joint learning. Journal of Zhejiang University (Science Edition), 2021, 48(3): 321-330.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2021.03.008
https://www.zjujournals.com/sci/CN/Y2021/V48/I3/321
