Journal of Zhejiang University (Engineering Science)  2024, Vol. 58 Issue (10): 2040-2052    DOI: 10.3785/j.issn.1008-973X.2024.10.007
Computer and Control Engineering
Rumor detection method based on breadth-depth sampling and graph convolutional networks
Youwei WANG1, Weiqi WANG1, Lizhou FENG2, Jianming ZHU1, Yang LI1
1. School of Information, Central University of Finance and Economics, Beijing 100081, China
2. School of Statistics, Tianjin University of Finance and Economics, Tianjin 300222, China
Abstract:

Existing rumor detection methods lose early propagation data and make insufficient use of features, so a new detection method was proposed. To fully capture the early propagation features of an event, a breadth sampling method was proposed and a propagation sequence was constructed for each event; a Transformer was then used to capture semantic correlations between long-distance comments and to build the event's propagation sequence features. To capture the structural features of event propagation, a depth sampling method based on path length was proposed, and an information propagation subgraph and an information aggregation subgraph were constructed for each event; graph convolutional networks, which are well suited to extracting graph structural features, were used to obtain the event's propagation structure features. The propagation sequence representation and the propagation structure representation of each event were concatenated to obtain its final feature representation. Validation experiments were conducted on two public datasets, Weibo2016 and CED. The results show that the proposed method generally outperforms existing typical methods. Compared with the baseline methods, it achieves significant improvements in accuracy and F1 score, which validates its effectiveness for rumor detection.
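The abstract names the two sampling steps without giving their details, so the following is only a minimal sketch under stated assumptions. Breadth sampling is assumed to mean taking the first `max_nodes` posts in breadth-first order. Depth sampling is assumed to mean keeping the longest fraction (`keep_ratio`) of root-to-leaf paths. The propagation and aggregation subgraphs are assumed to be the top-down and bottom-up versions of the kept edges. All function names, parameters, and the toy tree are hypothetical.

```python
from collections import deque


def breadth_sample(children, root, max_nodes):
    """Return up to max_nodes post ids in breadth-first order from the source post."""
    order, queue = [], deque([root])
    while queue and len(order) < max_nodes:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order


def root_to_leaf_paths(children, root):
    """Enumerate every root-to-leaf path of the propagation tree."""
    paths, stack = [], [[root]]
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            paths.append(path)
        for kid in kids:
            stack.append(path + [kid])
    return paths


def depth_sample(children, root, keep_ratio):
    """Keep the longest keep_ratio fraction of paths (assumed rule); return both edge sets."""
    paths = sorted(root_to_leaf_paths(children, root), key=len, reverse=True)
    n_keep = max(1, round(keep_ratio * len(paths)))
    top_down = set()
    for path in paths[:n_keep]:
        top_down.update(zip(path, path[1:]))  # parent -> child edges
    bottom_up = {(child, parent) for parent, child in top_down}  # child -> parent edges
    return top_down, bottom_up


# Hypothetical event: post 0 is the source, keys map a post to its direct replies.
tree = {0: [1, 2, 3], 1: [4, 5], 4: [6]}
sequence = breadth_sample(tree, 0, max_nodes=5)
propagation_edges, aggregation_edges = depth_sample(tree, 0, keep_ratio=0.5)
```

In this toy tree the breadth-first sequence favors the shallow replies that arrive early, while the path-based sample keeps the two deepest reply chains.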

Key words: rumor detection    graph convolutional network    breadth sampling    depth sampling    attention mechanism
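As a rough illustration of the graph-convolution and concatenation steps mentioned in the abstract, the sketch below applies one layer of the widely used rule ReLU(D^-1/2 (A + I) D^-1/2 H W) to a top-down and a bottom-up edge set, mean-pools each result, and concatenates the pooled vectors with a sequence vector. Whether the paper uses exactly this rule, pooling, and layer count is an assumption. The toy edges, features, identity weights, and the stand-in `sequence_vec` are hypothetical. Plain Python lists are used to keep the example dependency-free.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(edges, n):
    """Build D^-1/2 (A + I) D^-1/2 for a directed edge list over n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for src, dst in edges:
        a[src][dst] = 1.0
    deg = [sum(row) for row in a]  # row sums; self-loops keep every degree >= 1
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_hat, h, w):
    """One graph-convolution layer: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def mean_pool(h):
    """Average node embeddings into one graph-level vector."""
    return [sum(col) / len(h) for col in zip(*h)]


# Toy event: 3 posts, 2-dimensional node features, identity weights (all hypothetical).
edges = [(0, 1), (0, 2)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]

top_down = gcn_layer(normalized_adjacency(edges, 3), features, weights)
bottom_up = gcn_layer(normalized_adjacency([(d, s) for s, d in edges], 3), features, weights)
structure_vec = mean_pool(top_down) + mean_pool(bottom_up)  # list concatenation

sequence_vec = [0.2, 0.8]  # stand-in for a Transformer-derived sequence feature
event_vec = sequence_vec + structure_vec  # final representation, length 2 + 4
```

A trained model would learn `weights` and feed `event_vec` to a classifier; here the identity weights only make the normalization easy to inspect by hand.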
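Received: 2023-08-04    Published: 2024-09-27
CLC:  TP 393
Funding: National Natural Science Foundation of China (61906220); National Social Science Fund of China (18CTJ008); Humanities and Social Sciences Project of the Ministry of Education (19YJCZH178); Emerging Interdisciplinary Discipline Construction Project of Central University of Finance and Economics; Open Project of the Inner Mongolia Discipline Inspection and Supervision Big Data Laboratory, 2020-2021 (IMDBD202002, IMDBD202004).
About the author: WANG Youwei (born 1987), male, associate professor, Ph.D., working on machine learning and data mining. orcid.org/0000-0002-3925-3422. E-mail: ywwang15@126.com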

Cite this article:

Youwei WANG, Weiqi WANG, Lizhou FENG, Jianming ZHU, Yang LI. Rumor detection method based on breadth-depth sampling and graph convolutional networks. Journal of Zhejiang University (Engineering Science), 2024, 58(10): 2040-2052.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.10.007        https://www.zjujournals.com/eng/CN/Y2024/V58/I10/2040

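Fig. 1  Structure of the Transformer encoder block
Fig. 2  Schematic diagram of Weibo propagation
Fig. 3  Execution flowchart of the rumor detection method based on breadth-depth sampling and graph convolutional networks
Fig. 4  Schematic diagram of the breadth sampling strategy
Fig. 5  Schematic diagram of the depth sampling strategy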
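| Dataset | Total posts | Non-rumor posts | Rumor posts | Users | Average layers per event | Average posts per event | Average propagation duration per event / h |
|---|---|---|---|---|---|---|---|
| Weibo2016 | 4,664 | 2,351 | 2,313 | 2,746,818 | 2.85 | 431.27 | 1,811.4 |
| CED | 3,387 | 1,849 | 1,538 | 1,067,410 | 1.73 | 377.74 | 11.34 |

Table 1  Parameters of the two public datasets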
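| Method | Parameter settings |
|---|---|
| DTR | Selected features include information source credibility, identity, diversity, location, linguistic attitude, and propagation features; features are selected by information gain |
| DTC | Selected features include message content, user, topic, and propagation; features are selected by forward search |
| SVM-TS | Selected features include user information, content, and propagation; the kernel is the radial basis function (RBF) |
| GRU | Vocabulary size 5000, 2 GRU layers, learning rate 0.5 |
| RvNN | Vocabulary size 5000, word embedding dimension 100 |
| PLAN | Hidden layer dimension 300, learning rate 0.01, batch size 16 |
| Bi-GCN | Hidden layer dimension 64, dropout rate 0.5, edge dropout rate 0.5, Epoch=200, early stopping patience 10 |
| RDEA | Hidden layer dimension 64, node masking rate 0.2, edge dropout rate 0.5 |
| EBGCN | Hidden layer dimension 64, learning rate 0.0002, hidden layer dimension 200 |
| ACLR-BiGCN | Hidden layer dimension 512, 2 graph convolution layers, learning rate 0.0001, edge dropout rate 0.2, batch size 64 |
| UPSR | Hidden layer dimension 64, learning rate 0.001, Epoch=200 |

Table 2  Parameter settings of the baseline methods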
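| W | D | N/% | T | Acc | F1 |
|---|---|---|---|---|---|
| 1 | 0.3 | 45 | 47.91 | 0.938 | 0.937 |
|   | 0.5 | 52 | 50.32 | 0.944 | 0.943 |
|   | 0.7 | 72 | 56.30 | 0.949 | 0.949 |
| 2 | 0.3 | 58 | 53.93 | 0.943 | 0.943 |
|   | 0.5 | 65 | 58.76 | 0.952 | 0.950 |
|   | 0.7 | 84 | 76.30 | 0.931 | 0.929 |
| 3 | 0.3 | 66 | 64.27 | 0.933 | 0.934 |
|   | 0.5 | 77 | 89.44 | 0.941 | 0.939 |
|   | 0.7 | 94 | 99.23 | 0.922 | 0.918 |

Table 3  Parameter analysis results on the Weibo2016 dataset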
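| W | D | N/% | T | Acc | F1 |
|---|---|---|---|---|---|
| 1 | 0.3 | 44 | 5.64 | 0.922 | 0.923 |
|   | 0.5 | 54 | 6.19 | 0.928 | 0.928 |
|   | 0.7 | 82 | 8.76 | 0.932 | 0.929 |
| 2 | 0.3 | 62 | 6.26 | 0.923 | 0.918 |
|   | 0.5 | 71 | 8.54 | 0.930 | 0.932 |
|   | 0.7 | 96 | 9.28 | 0.914 | 0.916 |

Table 4  Parameter analysis results on the CED dataset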
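The two parameter tables can be scanned programmatically. The sketch below copies the rows of Tables 3 and 4 as tuples and picks the setting with the highest accuracy, using F1 only to break ties. The symbols W, D, N, and T are used exactly as they appear in the table headers. The selection rule and the function name are illustrative assumptions, not a procedure stated on this page.

```python
# Rows copied from Tables 3 and 4: (W, D, N in %, T, Acc, F1).
weibo2016_rows = [
    (1, 0.3, 45, 47.91, 0.938, 0.937),
    (1, 0.5, 52, 50.32, 0.944, 0.943),
    (1, 0.7, 72, 56.30, 0.949, 0.949),
    (2, 0.3, 58, 53.93, 0.943, 0.943),
    (2, 0.5, 65, 58.76, 0.952, 0.950),
    (2, 0.7, 84, 76.30, 0.931, 0.929),
    (3, 0.3, 66, 64.27, 0.933, 0.934),
    (3, 0.5, 77, 89.44, 0.941, 0.939),
    (3, 0.7, 94, 99.23, 0.922, 0.918),
]
ced_rows = [
    (1, 0.3, 44, 5.64, 0.922, 0.923),
    (1, 0.5, 54, 6.19, 0.928, 0.928),
    (1, 0.7, 82, 8.76, 0.932, 0.929),
    (2, 0.3, 62, 6.26, 0.923, 0.918),
    (2, 0.5, 71, 8.54, 0.930, 0.932),
    (2, 0.7, 96, 9.28, 0.914, 0.916),
]


def best_by_accuracy(rows):
    """Return the row with the highest Acc, breaking ties by F1 (assumed rule)."""
    return max(rows, key=lambda row: (row[4], row[5]))


best_weibo = best_by_accuracy(weibo2016_rows)
best_ced = best_by_accuracy(ced_rows)
print("Weibo2016: W=%d, D=%.1f, Acc=%.3f" % (best_weibo[0], best_weibo[1], best_weibo[4]))
print("CED: W=%d, D=%.1f, Acc=%.3f" % (best_ced[0], best_ced[1], best_ced[4]))
```

In both tables accuracy is not monotone in N or T: the largest samples (D = 0.7 with W = 2 or 3) cost the most time and score lower than mid-sized ones.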
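| Method category | Method | Weibo2016 Acc | Weibo2016 F1 | CED Acc | CED F1 | Twitter-COVID19 Acc | Twitter-COVID19 F1 |
|---|---|---|---|---|---|---|---|
| Traditional machine learning | DTR | 0.732 | 0.733 | 0.672 | 0.668 | 0.377 | 0.329 |
|   | DTC | 0.831 | 0.825 | 0.740 | 0.741 | 0.492 | 0.426 |
|   | SVM-TS | 0.857 | 0.859 | 0.746 | 0.756 | 0.510 | 0.498 |
| Event propagation sequence | GRU | 0.898 | 0.899 | 0.861 | 0.864 | 0.498 | 0.401 |
|   | RvNN | 0.908 | 0.908 | 0.892 | 0.891 | 0.540 | 0.391 |
|   | PLAN | 0.932 | 0.936 | 0.916 | 0.913 | 0.573 | 0.432 |
| Event propagation structure | Bi-GCN | 0.927 | 0.928 | 0.894 | 0.898 | 0.616 | 0.415 |
|   | RDEA | 0.921 | 0.921 | 0.910 | 0.916 | 0.638 | 0.504 |
|   | EBGCN | 0.937 | 0.935 | 0.880 | 0.879 | 0.589 | 0.563 |
|   | ACLR-BiGCN | 0.924 | 0.922 | 0.898 | 0.903 | 0.765 | 0.686 |
|   | UPSR | 0.934 | 0.928 | 0.896 | 0.895 | 0.602 | 0.587 |
|   | BDS-GCN | 0.944 | 0.943 | 0.928 | 0.928 | 0.682 | 0.674 |

Table 5  Experimental results of different rumor detection methods on three datasets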
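To make the comparison in Table 5 concrete, the following sketch computes, for each dataset, the accuracy margin between the BDS-GCN row and the strongest of the other rows. The accuracy values are copied from the table. Treating BDS-GCN as the method under comparison and every other row as a baseline is an assumption based on the table layout, and the function name is hypothetical.

```python
# Accuracy values copied from Table 5, keyed by dataset and then by method.
accuracy = {
    "Weibo2016": {
        "DTR": 0.732, "DTC": 0.831, "SVM-TS": 0.857, "GRU": 0.898,
        "RvNN": 0.908, "PLAN": 0.932, "Bi-GCN": 0.927, "RDEA": 0.921,
        "EBGCN": 0.937, "ACLR-BiGCN": 0.924, "UPSR": 0.934, "BDS-GCN": 0.944,
    },
    "CED": {
        "DTR": 0.672, "DTC": 0.740, "SVM-TS": 0.746, "GRU": 0.861,
        "RvNN": 0.892, "PLAN": 0.916, "Bi-GCN": 0.894, "RDEA": 0.910,
        "EBGCN": 0.880, "ACLR-BiGCN": 0.898, "UPSR": 0.896, "BDS-GCN": 0.928,
    },
    "Twitter-COVID19": {
        "DTR": 0.377, "DTC": 0.492, "SVM-TS": 0.510, "GRU": 0.498,
        "RvNN": 0.540, "PLAN": 0.573, "Bi-GCN": 0.616, "RDEA": 0.638,
        "EBGCN": 0.589, "ACLR-BiGCN": 0.765, "UPSR": 0.602, "BDS-GCN": 0.682,
    },
}


def margin_over_best_baseline(scores, target="BDS-GCN"):
    """Return (best baseline name, target score minus that baseline's score)."""
    baselines = {name: value for name, value in scores.items() if name != target}
    best = max(baselines, key=baselines.get)
    return best, round(scores[target] - baselines[best], 3)


margins = {name: margin_over_best_baseline(scores) for name, scores in accuracy.items()}
for dataset, (baseline, margin) in margins.items():
    print("%s: best baseline %s, margin %+.3f" % (dataset, baseline, margin))
```

On these numbers the margin is positive on Weibo2016 and CED and negative on Twitter-COVID19, where ACLR-BiGCN has the highest accuracy. This is consistent with the abstract's wording that the method is "generally" superior.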
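Fig. 6  Module ablation results of the rumor detection method on the two datasets
Fig. 7  Comparison of performance metrics of different rumor detection methods on the Weibo2016 dataset
Fig. 8  Comparison of performance metrics of different rumor detection methods on the CED dataset
Fig. 9  Early rumor detection results of different methods on the Weibo2016 dataset
Fig. 10  Early rumor detection results of different methods on the CED dataset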