Dialogue generation model based on knowledge transfer and two-direction asynchronous sequence
Yong-chao WANG 1, Yu CAO 2, Yu-hui YANG 1, Duan-qing XU 2,*
1. Information Technology Center, Zhejiang University, Hangzhou 310027, China
2. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract A dialogue generation model based on knowledge transfer and two-direction asynchronous sequence generation was proposed to address three problems of end-to-end dialogue generation models: generic, meaningless safe replies; heavy repetition of words; and the difficulty of introducing external knowledge into a dialogue system. External knowledge from a knowledge base was fused into the dialogue generation model and generated explicitly in the reply sentences. A pre-trained knowledge base question answering model was used to obtain the knowledge representation of the input sentence, the knowledge representations of the candidate answers, and the keywords. Two encoder-decoder structures were built, and the keywords were generated explicitly in the dialogue reply through two-direction asynchronous decoding. The knowledge understanding and knowledge representation capabilities of the pre-trained model were introduced at both the encoding and decoding stages so that dialogue generation captures knowledge information more effectively. A repetition detection-penalty mechanism was proposed, which reduces repeated words in the generated dialogue by assigning them penalty weights. Experimental results show that the proposed model outperforms existing dialogue generation methods on both automatic and human evaluation metrics.
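The repetition detection-penalty idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so the following is only one plausible instance under stated assumptions: the function name `penalize_repeats`, the `penalty` weight, and the rule of subtracting `log(penalty)` per earlier occurrence are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def penalize_repeats(logits, generated_ids, penalty=1.5):
    """Return a copy of `logits` with already generated tokens down-weighted.

    Assumption (not from the paper): each earlier occurrence of a token
    subtracts log(penalty) from that token's score before the softmax.
    """
    counts = Counter(generated_ids)          # detection: how often each token was emitted
    adjusted = list(logits)                  # copy so the caller's scores stay untouched
    for token_id, n in counts.items():
        adjusted[token_id] -= n * math.log(penalty)  # penalty grows with repetition
    return adjusted


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of three tokens; token 0 has already been generated twice.
logits = [2.0, 1.0, 0.5]
before = softmax(logits)
after = softmax(penalize_repeats(logits, generated_ids=[0, 0]))
```

In this toy case the probability of token 0 drops after the penalty, while tokens that have not been generated keep their raw scores and gain relative probability mass.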
Received: 11 July 2021
Published: 29 March 2022
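Fund: National Key Research and Development Program of China (2020YFC1523101, 2019YFC1521304); Key Research and Development Program of Zhejiang Province (2021C03140); Ningbo 2021 Major Science and Technology Innovation Project (20211ZDYF020028)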
Corresponding author: Duan-qing XU
E-mail: ychwang@zju.edu.cn; xdq@cs.zju.edu.cn
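Keywords: dialogue generation, knowledge entity, knowledge base question answering, two-direction asynchronous generation, sequence-to-sequence model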