Journal of Zhejiang University (Engineering Science)  2020, Vol. 54 Issue (7): 1264-1271    DOI: 10.3785/j.issn.1008-973X.2020.07.003
Automation Technology, Computer Technology
Method with recording text classification based on deep learning
Yan-nan ZHANG1, Xiao-hong HUANG1, Yan MA1,*, Qun CONG2
1. Information Network Center, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Beijing Wrdtech Limited Company, Beijing 100876, China
Abstract:

A deep-learning-based classification method was designed around the characteristics of recording text and its associated data, in order to improve the classification precision for recording text that has associated work order data. Vector representations of the recording text and the work order information were obtained with the bidirectional word embedding language model (ELMo). Based on these word vectors, a convolutional neural network (CNN) mined the local features of sentences. CNNs also mined the title and the description of the work order separately. The features output by the CNNs were concatenated with a weighting factor and then fed into a bidirectional gated recurrent unit (GRU) to capture the contextual semantic features of sentences. An attention mechanism was introduced to assign different weights to the output states of the GRU hidden layer. The experimental results show that, compared with existing algorithms, the method converges faster and achieves higher accuracy.

Key words: word vector    convolutional neural networks (CNN)    bidirectional gated recurrent unit    attention    text classification
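Received: 2019-07-30    Published: 2020-07-05
CLC:  TP 391
Funding: Fundamental Research Funds for the Central Universities (2018RC53); National CNGI Special Project (CNGI-12-03-001)
Corresponding author: Yan MA     E-mail: knightzyn@163.com; mayan@bupt.edu.cn
About the author: Yan-nan ZHANG (born 1995), female, master's student, engaged in research on cyberspace security. orcid.org/0000-0003-2462-1760. E-mail: knightzyn@163.com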

Cite this article:

Yan-nan ZHANG, Xiao-hong HUANG, Yan MA, Qun CONG. Method with recording text classification based on deep learning[J]. Journal of Zhejiang University (Engineering Science), 2020, 54(7): 1264-1271.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2020.07.003
http://www.zjujournals.com/eng/CN/Y2020/V54/I7/1264

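Fig. 1  Schematic of the recording text classification model
Fig. 2  Structure of the ELMo model
Fig. 3  Structure of the sentence-level CNN model
Fig. 4  Schematic of the weighted concatenation of CNN features
Fig. 5  Structure of the GRU
Fig. 6  Structure of the attention model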
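The fusion and attention steps named in the abstract and in Figs. 4 and 6 can be illustrated with a minimal pure-Python sketch. This page does not give the paper's exact formulas, so everything here is an assumption: the weighting is taken to be `gamma` on the recording text features and `1 - gamma` on the work order features, the attention score is a plain dot product with a hypothetical `query` vector, and the function names are invented for illustration. The GRU itself is omitted; `hidden_states` stands in for its outputs.

```python
import math


def weighted_concat(text_feat, order_feat, gamma):
    """Scale two feature vectors by gamma and (1 - gamma), then concatenate.

    Assumed weighting scheme; the paper's exact formula is not shown here.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return [gamma * x for x in text_feat] + [(1.0 - gamma) * x for x in order_feat]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each hidden state by softmax(dot(h_t, query)) and sum them."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, hidden_states)) for i in range(dim)]
    return weights, context


# Toy values only: two text features, one work order feature.
fused = weighted_concat([1.0, 2.0], [4.0], gamma=0.7)

# Stand-in for GRU outputs at three time steps (hidden size 2).
hidden_states = [[0.2, 0.1], [0.9, 0.4], [0.3, 0.8]]
weights, context = attention_pool(hidden_states, query=[1.0, 0.5])
print(fused, weights, context)
```

The attention weights sum to 1, so `context` is a convex combination of the hidden states; time steps whose hidden state aligns more closely with `query` contribute more.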
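| Category | Training samples | Validation samples |
|---|---|---|
| Network fault repair reports | 14,138 | 1,414 |
| Campus card service inquiries | 12,092 | 1,209 |
| Information portal inquiries | 10,578 | 1,058 |
| E-mail service inquiries | 9,130 | 913 |
| Cloud drive service inquiries | 8,259 | 826 |
| Licensed software usage | 7,810 | 781 |

Table 1  Distribution of experimental data for the recording text classification method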
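| Model parameter | Value used | Values tested |
|---|---|---|
| Word vector dimension | 200 | 100, 200, 300 |
| CNN kernel size | 3 | 3, 4, 5 |
| Number of CNN kernels | 128 | 64, 128, 256 |
| Epochs | 25 | 10, 15, 20, 25, 30 |
| Batch size | 128 | 64, 128, 256 |
| Dropout rate | 0.5 | 0.4, 0.5, 0.6 |

Table 2  Neural network parameter values of the recording text classification model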
| Actual class | Predicted positive | Predicted negative |
|---|---|---|
| Positive | TP | FN |
| Negative | FP | TN |

Table 3  Confusion matrix
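The precision (P) and recall (R) reported in the tables that follow are derived from these confusion-matrix counts using the standard definitions, P = TP / (TP + FP) and R = TP / (TP + FN). The short sketch below applies them; the counts are made-up illustrative numbers, not values from the paper.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that the model recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for one class, for illustration only.
tp, fp, fn = 90, 10, 30
print(f"P = {precision(tp, fp):.4f}, R = {recall(tp, fn):.4f}")
```

For a six-class task such as the one in Table 1, each class is treated in turn as the positive class; how the per-class values are averaged is not stated on this page.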
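| $\gamma$ | P | R |
|---|---|---|
| 0.5 | 0.9097 | 0.8402 |
| 0.6 | 0.9338 | 0.8735 |
| 0.7 | 0.9532 | 0.9050 |
| 0.8 | 0.9215 | 0.8526 |
| 0.9 | 0.8932 | 0.8220 |

Table 4  Relationship between precision (P), recall (R), and the weighting coefficient $\gamma$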
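Both precision and recall peak at $\gamma$ = 0.7. One way to summarize each row as a single number is the F1 score, F1 = 2PR / (P + R). F1 is not reported on this page, so the sketch below is a derived illustration computed from the tabulated P and R, not a result from the paper.

```python
def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# (P, R) for each weighting coefficient, copied from Table 4.
table4 = {
    0.5: (0.9097, 0.8402),
    0.6: (0.9338, 0.8735),
    0.7: (0.9532, 0.9050),
    0.8: (0.9215, 0.8526),
    0.9: (0.8932, 0.8220),
}

f1_by_gamma = {gamma: f1_score(p, r) for gamma, (p, r) in table4.items()}
best_gamma = max(f1_by_gamma, key=f1_by_gamma.get)

for gamma, f1 in f1_by_gamma.items():
    print(f"gamma = {gamma:.1f}  F1 = {f1:.4f}")
print("best gamma by F1:", best_gamma)
```

Because P and R both reach their maximum in the same row, the choice of $\gamma$ = 0.7 does not depend on how the two metrics are combined.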
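| Model | P | R |
|---|---|---|
| CNN | 0.7344 | 0.7495 |
| BiLSTM | 0.8732 | 0.7623 |
| CNN+BiLSTM | 0.9001 | 0.8738 |
| BiGRU-FCN | 0.9143 | 0.8702 |
| Attention-based C-GRU | 0.9339 | 0.8840 |
| Proposed model | 0.9532 | 0.9050 |

Table 5  Comparison of experimental results for text classification methods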
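To make the comparison concrete, the margins of the proposed model over each baseline can be computed directly from the tabulated values. The sketch below does only that arithmetic; the helper name `margins` is hypothetical, and the differences are absolute differences in P and R, not significance tests.

```python
# Precision (P) and recall (R) of the baselines, copied from Table 5.
baselines = {
    "CNN": (0.7344, 0.7495),
    "BiLSTM": (0.8732, 0.7623),
    "CNN+BiLSTM": (0.9001, 0.8738),
    "BiGRU-FCN": (0.9143, 0.8702),
    "Attention-based C-GRU": (0.9339, 0.8840),
}
proposed = (0.9532, 0.9050)


def margins(baseline_scores, ours):
    """Absolute gain in (P, R) of `ours` over each baseline, rounded to 4 places."""
    return {
        name: (round(ours[0] - p, 4), round(ours[1] - r, 4))
        for name, (p, r) in baseline_scores.items()
    }


gains = margins(baselines, proposed)
for name, (dp, dr) in gains.items():
    print(f"{name:<22} P +{dp:.4f}  R +{dr:.4f}")
```

The smallest margin is over Attention-based C-GRU, the strongest baseline in the table on both metrics.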