Journal of Zhejiang University (Engineering Science)  2022, Vol. 56 Issue (5): 956-966    DOI: 10.3785/j.issn.1008-973X.2022.05.013
Computer and Control Engineering
New inductive microblog rumor detection method based on graph convolutional network
You-wei WANG1, Shuang TONG1, Li-zhou FENG2, Jian-ming ZHU1, Yang LI1, Fu CHEN1
1. School of Information, Central University of Finance and Economics, Beijing 100081, China
2. School of Statistics, Tianjin University of Finance and Economics, Tianjin 300222, China
Abstract:

A new inductive microblog rumor detection method based on graph convolutional networks (GCN) is proposed to address two problems that traditional GCNs face in rumor detection: insufficient use of word semantic information and the difficulty of choosing a pooling method. First, to account for the semantic relationships between words, a microblog event graph construction method based on word semantic correlation is proposed, which extends the traditional word co-occurrence graph construction method; node information is then aggregated by combining GCN with a gated recurrent unit (GRU). Second, to effectively fuse the feature information of different node states, an attention-based fusion strategy that combines max-pooling, average-pooling and global-pooling is proposed to obtain the final graph-level vector. Finally, to improve the efficiency of microblog rumor detection, the influence of microblog comment time on detection results is explored, and the best comment utilization time threshold for model training is obtained. Experimental results show that the proposed method generally outperforms Text-CNN, Bi-GCN, TextING and other typical methods on the given datasets, which verifies its effectiveness for microblog rumor detection.

Key words: rumor detection; graph convolutional network; microblog event; gated recurrent unit; attention mechanism
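The attention-based fusion of pooling results described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function (a dot product with a hypothetical learned vector `query`) is an assumption, and "global pooling" is approximated here by sum pooling because this page does not define it.

```python
import math
from typing import List, Tuple

Vector = List[float]


def max_pool(nodes: List[Vector]) -> Vector:
    # Element-wise maximum over all node vectors.
    return [max(col) for col in zip(*nodes)]


def mean_pool(nodes: List[Vector]) -> Vector:
    # Element-wise average over all node vectors.
    return [sum(col) / len(nodes) for col in zip(*nodes)]


def sum_pool(nodes: List[Vector]) -> Vector:
    # Assumption: stands in for the paper's "global pooling".
    return [sum(col) for col in zip(*nodes)]


def softmax(scores: List[float]) -> List[float]:
    # Numerically stable softmax.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(nodes: List[Vector], query: Vector) -> Tuple[Vector, List[float]]:
    # Score each pooled vector against the (hypothetical) query vector,
    # then return the attention-weighted sum as the graph-level vector.
    pooled = [max_pool(nodes), mean_pool(nodes), sum_pool(nodes)]
    scores = [sum(q * p for q, p in zip(query, vec)) for vec in pooled]
    weights = softmax(scores)
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, pooled))
        for i in range(len(query))
    ]
    return fused, weights


# Toy example: two nodes with 2-dimensional states and a zero query,
# which gives each pooling result the same weight.
graph_vector, pooling_weights = attention_fuse([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In a trained model the query vector would be learned jointly with the GCN and GRU parameters, so the weights would shift toward whichever pooling result is most informative.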
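Received: 2021-11-14    Published: 2022-05-31
CLC:  TP 391
Funding: National Natural Science Foundation of China (61906220); Humanities and Social Sciences Project of the Ministry of Education (19YJCZH178); National Social Science Fund of China (18CTJ008); Natural Science Foundation of Tianjin (18JCQNJC69600); Open Project of the Inner Mongolia Discipline Inspection and Supervision Big Data Laboratory, 2020-2021 (IMDBD202002, IMDBD202004); Emerging Interdisciplinary Project of Central University of Finance and Economics; China University Industry-University-Research Innovation Fund (2021FNA01002)
About the author: WANG You-wei (born 1987), male, associate professor, Ph.D., research interests: machine learning and data mining. orcid.org/0000-0002-3925-3422. E-mail: ywwang15@126.com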
Cite this article:

You-wei WANG, Shuang TONG, Li-zhou FENG, Jian-ming ZHU, Yang LI, Fu CHEN. New inductive microblog rumor detection method based on graph convolutional network [J]. Journal of Zhejiang University (Engineering Science), 2022, 56(5): 956-966.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.05.013
https://www.zjujournals.com/eng/CN/Y2022/V56/I5/956

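Fig. 1  Example of a microblog event
Fig. 2  Flowchart of the inductive graph convolutional network
Fig. 3  Example of a problem faced by the graph construction method of reference [20]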
| Dataset | nu | ne | nr | nt | nc |
|---|---|---|---|---|---|
| Ma_Dataset | 2 746 818 | 4 664 | 2 351 | 2 313 | 3 805 656 |
| Song_Dataset | 1 067 410 | 3 387 | 1 838 | 1 849 | 1 275 180 |

Table 1  Details of the datasets
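| tm/h | Typical comments |
|---|---|
| 1 | Is this true? / Really? |
| 2 | For real?! [shocked][haha] / Is it true or fake, am I awake or drunk right now [onlooker] |
| 3 | True or fake? Why is there no news about it? / Must be fake |
| 4 | Fake, right? Photoshopped, right? / [sweat] Really? |
| 5 | Is it true? I want the truth / Really.. really..? |
| 6 | Is it true?~ Xiao Zong! / Is it true |
| 7 | True or fake [thinking] / Looking for the truth··· |
| 8 | True or fake? Shocking~ / Are you kidding me? |
| 9 | No way [going crazy] / It's fake, isn't it |
| 10 | What? True or fake? / It's so fake, yet some people believe it |

Table 2  A source microblog's related comments within 10 h of its posting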
Fig. 4  Rumor detection accuracy at different time points within 0-10 h after the source microblog was posted
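| Variant of the proposed graph construction method | th |
|---|---|
| WR-1 | 0.95 |
| WR-2 | 0.90 |
| WR-3 | 0.85 |
| WR-4 | 0.80 |
| WR-5 | 0.75 |
| WR-6 | 0.70 |

Table 3  Variants of the proposed graph construction method under different thresholds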
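How a threshold such as th could shape the event graph can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: semantic correlation is taken to be the cosine similarity of word embeddings, the toy embeddings are made up, and all function names are hypothetical. Co-occurrence edges use a sliding window (size 3, the TextING setting listed in Table 4), and an extra edge is added whenever two words are semantically similar at or above the threshold.

```python
import math
from itertools import combinations
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def cosine(u: List[float], v: List[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cooccurrence_edges(tokens: List[str], window: int = 3) -> Set[Edge]:
    # Connect every pair of distinct words that share a sliding window.
    edges: Set[Edge] = set()
    for start in range(max(1, len(tokens) - window + 1)):
        for a, b in combinations(tokens[start:start + window], 2):
            if a != b:
                edges.add((min(a, b), max(a, b)))
    return edges


def semantic_edges(tokens: List[str],
                   embeddings: Dict[str, List[float]],
                   th: float) -> Set[Edge]:
    # Connect word pairs whose embedding similarity reaches the threshold.
    vocab = sorted({t for t in tokens if t in embeddings})
    return {
        (a, b)
        for a, b in combinations(vocab, 2)
        if cosine(embeddings[a], embeddings[b]) >= th
    }


def build_graph(tokens: List[str],
                embeddings: Dict[str, List[float]],
                window: int = 3,
                th: float = 0.85) -> Set[Edge]:
    # Union of co-occurrence edges and semantic-similarity edges.
    return cooccurrence_edges(tokens, window) | semantic_edges(tokens, embeddings, th)


# Toy data (hypothetical): "rumor" and "hoax" never share a window,
# but their embeddings are close, so a semantic edge links them.
tokens = ["rumor", "spreads", "fast", "online", "hoax"]
embeddings = {"rumor": [1.0, 0.0], "hoax": [0.9, 0.1], "fast": [0.0, 1.0]}
graph = build_graph(tokens, embeddings, window=3, th=0.85)
```

Lowering the threshold admits more semantic edges and so yields a denser graph, which is presumably why several threshold values are compared.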
Fig. 5  Accuracy comparison of different graph construction methods
Fig. 6  Accuracy comparison of different pooling methods
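| Compared method | Experimental settings |
|---|---|
| DT-Rank[4] | Selected features: source credibility, source identity, source diversity, source location, linguistic attitude, and event propagation features; feature selection method: information gain. |
| SVM-TS[3] | Selected features: content, user, and propagation features; kernel function: RBF. |
| Text-CNN[23] | Convolution kernel sizes 3, 4, and 5; 256 convolution kernels. |
| GRU-2[8] | 2 GRU layers; vocabulary size 5 000. |
| dEFEND[24] | Attention layer dimension 100; co-attention layer latent dimension 200. |
| Text-GCN[18, 22] | 2 GCN layers. |
| Bi-GCN[25] | Early-stopping patience of 10 batches. |
| GLAN[26] | Convolution kernel sizes 3, 4, and 5; 100 convolution kernels. |
| TextING[20] | Sliding window size 3. |

Table 4  Parameter settings of the compared methods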
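| Method | Ma_Dataset Acc | Ma_Dataset Pre | Ma_Dataset Rec | Ma_Dataset F1 | Song_Dataset Acc | Song_Dataset Pre | Song_Dataset Rec | Song_Dataset F1 |
|---|---|---|---|---|---|---|---|---|
| DT-Rank | 0.727 | 0.736 | 0.731 | 0.733 | 0.653 | 0.637 | 0.665 | 0.651 |
| SVM-TS | 0.829 | 0.814 | 0.823 | 0.818 | 0.746 | 0.751 | 0.761 | 0.756 |
| Text-CNN | 0.848 | 0.839 | 0.854 | 0.846 | 0.801 | 0.807 | 0.812 | 0.809 |
| GRU-2 | 0.902 | 0.895 | 0.891 | 0.893 | 0.842 | 0.837 | 0.846 | 0.841 |
| dEFEND | 0.917 | 0.912 | 0.929 | 0.920 | 0.881 | 0.873 | 0.898 | 0.885 |
| Text-GCN | 0.924 | 0.915 | 0.919 | 0.917 | 0.889 | 0.892 | 0.885 | 0.888 |
| Bi-GCN | 0.929 | 0.931 | 0.924 | 0.927 | 0.901 | 0.897 | 0.906 | 0.901 |
| GLAN | 0.930 | 0.935 | 0.932 | 0.933 | 0.903 | 0.908 | 0.912 | 0.910 |
| TextING | 0.938 | 0.937 | 0.943 | 0.940 | 0.912 | 0.906 | 0.915 | 0.910 |
| Proposed method | 0.946 | 0.939 | 0.943 | 0.941 | 0.923 | 0.925 | 0.922 | 0.923 |

Table 5  Microblog rumor detection results of the proposed method compared with existing typical methods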
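As a reading aid for the metric columns, the short sketch below recomputes F1 from precision and recall for two Ma_Dataset rows of Table 5. It assumes that the reported F1 is the harmonic mean of the reported Pre and Rec, which this page does not state explicitly; under that assumption both rows reproduce the published F1 to three decimals.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall; 0.0 when both are zero.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Pre, Rec, reported F1) copied from the Ma_Dataset columns of Table 5.
reported: Dict[str, Tuple[float, float, float]] = {
    "DT-Rank": (0.736, 0.731, 0.733),
    "Proposed method": (0.939, 0.943, 0.941),
}

for name, (pre, rec, f1) in reported.items():
    # Print the recomputed value next to the reported one.
    print(name, round(f1_score(pre, rec), 3), f1)
```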
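| Method | Ma_Dataset Acc | Ma_Dataset Pre | Ma_Dataset Rec | Ma_Dataset F1 | Song_Dataset Acc | Song_Dataset Pre | Song_Dataset Rec | Song_Dataset F1 |
|---|---|---|---|---|---|---|---|---|
| DT-Rank | 0.723 | 0.728 | 0.725 | 0.726 | 0.647 | 0.635 | 0.669 | 0.652 |
| SVM-TS | 0.824 | 0.810 | 0.817 | 0.813 | 0.743 | 0.753 | 0.764 | 0.758 |
| Text-CNN | 0.839 | 0.833 | 0.849 | 0.841 | 0.800 | 0.813 | 0.809 | 0.811 |
| GRU-2 | 0.899 | 0.896 | 0.885 | 0.890 | 0.839 | 0.835 | 0.847 | 0.841 |
| dEFEND | 0.915 | 0.913 | 0.931 | 0.922 | 0.877 | 0.869 | 0.899 | 0.883 |
| Text-GCN | 0.925 | 0.916 | 0.913 | 0.914 | 0.892 | 0.887 | 0.880 | 0.883 |
| Bi-GCN | 0.928 | 0.933 | 0.921 | 0.927 | 0.902 | 0.895 | 0.911 | 0.903 |
| GLAN | 0.929 | 0.936 | 0.930 | 0.933 | 0.902 | 0.907 | 0.916 | 0.911 |
| TextING | 0.937 | 0.936 | 0.939 | 0.937 | 0.909 | 0.908 | 0.911 | 0.909 |
| Proposed method | 0.945 | 0.938 | 0.941 | 0.940 | 0.921 | 0.920 | 0.925 | 0.923 |

Table 6  Validation of the effectiveness of the optimal comment utilization time threshold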
1 ZUBIAGA A, AKER A, BONTCHEVA K, et al. Detection and resolution of rumours in social media: a survey [J]. ACM Computing Surveys (CSUR), 2018, 51(2): 1-36.
2 Sina Weibo official rumor-refuting account. 2020 annual Weibo rumor-refutation data report [EB/OL]. (2020-02-07) [2021-11-05]. https://weibo.com/1866405545/K0QaImwsK.
3 MA J, GAO W, WEI Z, et al. Detect rumors using time series of social context information on microblogging websites [C]// Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. Melbourne: CIKM, 2015.
4 ZHAO Z, RESNICK P, MEI Q. Enquiring minds: early detection of rumors in social media from enquiry posts [C]// Proceedings of the 24th International Conference on World Wide Web. New York: WWW, 2015.
5 ZHANG Yang-sen, PENG Yuan-yuan, DUAN Yu-xiang, et al. The method of Sina Weibo rumor detecting based on comment abnormality [J]. Acta Automatica Sinica, 2020, 46(8): 1689-1702.
6 ZENG Zi-ming, WANG Jing. Research on Microblog rumor identification based on LDA and random forest [J]. Journal of the China Society for Scientific and Technical Information, 2019, 38(1): 89-96. doi: 10.3772/j.issn.1000-0135.2019.01.010
7 CAI G, BI M, LIU J. A novel rumor detection method based on labeled cascade propagation tree [C]// Proceedings of the 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery. Guilin: ICNC-FSKD, 2017.
8 MA J, GAO W, MITRA P, et al. Detecting rumors from microblogs with recurrent neural networks [C]// International Joint Conference on Artificial Intelligence. New York: IJCAI, 2016.
9 WANG Z, GUO Y, WANG J, et al. Rumor events detection from Chinese microblogs via sentiments enhancement [J]. IEEE Access, 2019, 7: 103000-103018. doi: 10.1109/ACCESS.2019.2928044
10 YIN Peng-bo, PAN Wei-min, PENG Cheng, et al. Research on early detection of Weibo rumors based on user characteristics analysis [J]. Journal of Intelligence, 2020, 39(7): 81-86. doi: 10.3969/j.issn.1002-1965.2020.07.014
11 SONG C, YANG C, CHEN H, et al. CED: credible early detection of social media rumors [J]. IEEE Transactions on Knowledge and Data Engineering, 2019, 33(8): 3035-3047.
12 LIU Zheng, WEI Zhi-hua, ZHANG Ren-xian. Rumor detection based on convolutional neural network [J]. Journal of Computer Applications, 2017, 37(11): 3053-3056.
13 HU Dou, WEI Ling-wei, ZHOU Wei, et al. A rumor detection approach based on multi-relational propagation tree [J]. Journal of Computer Research and Development, 2021, 58(7): 1395-1411. doi: 10.7544/issn1000-1239.2021.20200810
14 WU Z, PI D, CHEN J, et al. Rumor detection based on propagation graph neural network with attention mechanism [J]. Expert Systems with Applications, 2020, 158: 113595. doi: 10.1016/j.eswa.2020.113595
15 YANG Yan-jie, WANG Li, WANG Yu-hang. Rumor detection based on source information and gating graph neural network [J]. Journal of Computer Research and Development, 2021, 58(7): 1412-1424. doi: 10.7544/issn1000-1239.2021.20200801
16 YANG X, LYU Y, TIAN T, et al. Rumor detection on social media with graph structured adversarial learning [C]// Proceedings of the 29th International Conference on International Joint Conferences on Artificial Intelligence. Montreal: IJCAI, 2021.
17 HU L, YANG T, SHI C, et al. Heterogeneous graph attention networks for semi-supervised short text classification [C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong: EMNLP-IJCNLP, 2019.
18 YAO L, MAO C, LUO Y. Graph convolutional networks for text classification [C]// Proceedings of the AAAI Conference on Artificial Intelligence. New Orleans: AAAI, 2019.
19 LIU X, YOU X, ZHANG X, et al. Tensor graph convolutional networks for text classification [C]// Proceedings of the AAAI Conference on Artificial Intelligence. New York: AAAI, 2020.
20 ZHANG Y, YU X, CUI Z, et al. Every document owns its structure: inductive text classification via graph neural networks [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. [s.l.]: ACL, 2020.
21 LI Y, TARLOW D, BROCKSCHMIDT M, et al. Gated graph sequence neural networks [C]// Proceedings of the 4th International Conference on Learning Representations. Puerto Rico: ICLR, 2016.
22 MI Yuan, TANG Heng-liang. Rumor identification research based on graph convolutional network [J]. Computer Engineering and Applications, 2021, 57(13): 161-167. doi: 10.3778/j.issn.1002-8331.2003-0357
23 KIM Y. Convolutional neural networks for sentence classification [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: EMNLP, 2014.
24 SHU K, CUI L, WANG S, et al. dEFEND: explainable fake news detection [C]// Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Anchorage: KDD, 2019.
25 BIAN T, XIAO X, XU T, et al. Rumor detection on social media with bi-directional graph convolutional networks [C]// Proceedings of the AAAI Conference on Artificial Intelligence. New York: AAAI, 2020.