Journal of Zhejiang University (Engineering Science)  2026, Vol. 60, Issue (3): 527-535    DOI: 10.3785/j.issn.1008-973X.2026.03.008
Computer Technology and Control Engineering
Large language model framework for event extraction based on staged semantic perception
Yansong LI1(),Ning CHEN2,Fengguang LIU2,Pan CHEN2,Xiaofeng HUANG1,Huili GE2,*()
1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
2. Department of Science and Technology of Zhejiang Province, Hangzhou 310006, China
Abstract:

A large language model framework for event extraction based on staged semantic perception was proposed to address the difficulty of modeling the hierarchical semantics of events; the framework simulates the human cognitive mechanism of "recognizing the whole first, then learning the details". Structured unified coding ensured the consistency of prompts across different domains. A plug-and-play semantic perception driver unit supported staged learning for event-type prediction and argument extraction, and an adaptive weight allocation mechanism made the model focus on fine-grained semantic information. Data augmentation based on event decomposition was proposed to enrich the training data and thereby enhance the generalization ability of the model. Experimental results on the CASIE and ACE2005 datasets demonstrated that the proposed method significantly improved event extraction performance.
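The two-stage procedure and the adaptive weighting idea summarized above can be sketched roughly as follows. This is a minimal illustrative sketch only: the prompt templates, function names, toy schema, and the simple token-weighting rule are all assumptions of this sketch, not the paper's actual structured unified coding or training implementation.

```python
# Stage 1 asks only for event types ("recognize the whole first");
# stage 2 asks for the arguments of one predicted type ("then learn the details").
# All names and templates below are hypothetical.

def build_stage1_prompt(text, schema):
    """Build a stage-1 prompt that requests only event types."""
    types = ", ".join(sorted(schema))
    return (f"Candidate event types: {types}\n"
            f"Text: {text}\n"
            f"Task: list the event types that occur in the text.")

def build_stage2_prompt(text, event_type, schema):
    """Build a stage-2 prompt that requests arguments for one event type."""
    roles = ", ".join(schema[event_type])
    return (f"Event type: {event_type}\nRoles: {roles}\n"
            f"Text: {text}\n"
            f"Task: fill each role with a span from the text.")

def adaptive_token_weights(tokens, key_tokens, boost=2.0):
    """Toy stand-in for adaptive weight allocation: up-weight the loss
    contribution of fine-grained semantic tokens (e.g. argument spans),
    then normalize so the mean weight stays 1."""
    raw = [boost if t in key_tokens else 1.0 for t in tokens]
    total = sum(raw)
    return [w / total * len(raw) for w in raw]

# Tiny usage example with a two-type toy schema.
schema = {"Attack": ["Attacker", "Target"], "Transport": ["Agent", "Destination"]}
text = "Rebels attacked the convoy."
p1 = build_stage1_prompt(text, schema)
p2 = build_stage2_prompt(text, "Attack", schema)
w = adaptive_token_weights(["Rebels", "attacked", "the", "convoy"],
                           {"Rebels", "convoy"})
```

In this sketch the stage-2 prompt is only issued for event types produced by stage 1, mirroring the staged learning of event-type prediction before argument extraction.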

Key words: natural language processing    event extraction    large language model    supervised fine-tuning    data augmentation
Received: 2025-03-11    Published: 2026-02-04
CLC number: TP 319
Funding: National Key Research and Development Program of China (2024YFB3312600); "Leading Goose" Research and Development Program of Zhejiang Province (2024C01107).
Corresponding author: Huili GE    E-mail: yansongli@hdu.edu.cn; 429362862@qq.com
About the author: Yansong LI (1999—), male, master's degree candidate, engaged in research on large-model information extraction and image compression. orcid.org/0009-0004-1556-9854. E-mail: yansongli@hdu.edu.cn
Cite this article:

Yansong LI, Ning CHEN, Fengguang LIU, Pan CHEN, Xiaofeng HUANG, Huili GE. Large language model framework for event extraction based on staged semantic perception. Journal of Zhejiang University (Engineering Science), 2026, 60(3): 527-535.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.03.008        https://www.zjujournals.com/eng/CN/Y2026/V60/I3/527

Fig. 1  Large language model framework for event extraction based on staged semantic perception
Fig. 2  Example of structured unified coding
Fig. 3  Comparison of LLM optimization objectives between the conventional seq2seq task and the semantic perception driver module
Fig. 4  Working principle of the semantic perception driver module
Fig. 5  Example of data augmentation based on event decomposition
Fig. 6  Fine-tuning and prediction
Dataset | Domain | Event types / role types | Train | Dev | Test
ACE05 | General news | 33/22 | 3342 | 327 | 293
CASIE | Cybersecurity | 5/26 | 3751 | 788 | 1500
Table 1  Statistics of the datasets
Method | Backbone | Parameters | CASIE F1e/% | CASIE F1a/% | ACE05 F1e/% | ACE05 F1a/%
Bert-base | BERT-base | 110×10^6 | 68.98 | 60.37 | 72.50 | 59.50
EEQA | 2×BERT-base | 220×10^6 | — | — | 72.40 | 53.30
UIE | T5-v1.1-base | 770×10^6 | 69.33 | 61.30 | 73.36 | 54.79
USM | RoBERTa-Large | 355×10^6 | 71.73 | 63.26 | 72.41 | 55.83
InstructUIE | FlanT5-11B | 11×10^9 | 67.80 | 63.53 | 77.13 | 72.94
Schema-Aware-EE | ChatGPT | >20×10^9 | 70.28 | 56.28 | 73.68 | 49.56
LC4EE | GPT-4 | >175×10^9 | — | — | 77.20 | 54.90
TALOR-EE | GPT-3.5-turbo | >20×10^9 | — | — | 70.50 | 47.70
Proposed method | ChatGLM3 | 6×10^9 | 90.93 | 63.71 | 76.56 | 72.07
Proposed method | GLM4-9B-0414 | 9×10^9 | 93.40 | 66.87 | 77.81 | 76.52
Table 2  Performance comparison of models on the ACE05 and CASIE datasets
Fig. 7  Issues in the CASIE dataset
Method | CASIE F1e/% | CASIE F1a/% | ACE05 F1e/% | ACE05 F1a/%
ChatGLM3 | 89.34 | 62.58 | 74.28 | 70.77
ChatGLM3 + semantic perception driver | 90.58 | 62.88 | 74.85 | 71.24
GLM4-9B-0414 | 91.86 | 65.28 | 75.76 | 73.57
GLM4-9B-0414 + semantic perception driver | 92.70 | 65.85 | 76.72 | 75.03
Table 3  Ablation study of the semantic perception driver unit
Method | CASIE F1e/% | CASIE F1a/% | ACE05 F1e/% | ACE05 F1a/%
ChatGLM3 | 89.34 | 62.58 | 74.28 | 70.77
ChatGLM3 + event decomposition | 89.46 | 62.25 | 75.19 | 71.67
GLM4-9B-0414 | 91.86 | 65.28 | 75.76 | 73.57
GLM4-9B-0414 + event decomposition | 92.50 | 65.40 | 76.70 | 74.20
Table 4  Ablation study of data augmentation based on event decomposition
Method | 5-shot | ACE05→CASIE F1e/% | F1a/% | CASIE→ACE05 F1e/% | F1a/%
ChatGLM3 | × | 61.73 | 40.65 | 40.72 | 39.59
ChatGLM3 | √ | 63.93 | 43.26 | 42.83 | 41.21
ChatGLM3 with the proposed method | × | 47.57 | 42.45 | 41.63 | 39.68
ChatGLM3 with the proposed method | √ | 68.21 | 48.25 | 47.52 | 41.70
GLM4-9B-0414 | × | 70.06 | 44.06 | 44.70 | 42.19
GLM4-9B-0414 | √ | 73.66 | 46.90 | 46.75 | 44.15
GLM4-9B-0414 with the proposed method | × | 78.26 | 43.31 | 46.75 | 43.30
GLM4-9B-0414 with the proposed method | √ | 79.85 | 45.78 | 52.21 | 45.07
Table 5  Experiments on cross-domain generalization
Reason for downtime | Event description
Instrument relocation and laboratory renovation | After being relocated in June 2023, the equipment was taken out of service owing to laboratory restructuring and site renovation, and has not yet been restored to use.
Sensor troubleshooting | The equipment was in normal use, but a circuit upgrade to a three-phase power supply made the original sensor unsuitable, and the laboratory did not report the replacement in time.
Instrument pending scrapping | The instrument is obsolete and its technical specifications are outdated.
Table 6  Examples of reasons for long-term platform-equipment downtime
1 AHN D. The stages of event extraction [C]// Proceedings of the Workshop on Annotating and Reasoning about Time and Events. Sydney: ACL, 2006: 1–8.
2 DODDINGTON G R, MITCHELL A, PRZYBOCKI M A, et al. The automatic content extraction (ACE) program: tasks, data, and evaluation [C]// Proceedings of the International Conference on Language Resources and Evaluation. Lisbon: ELRA, 2004: 837–840.
3 ZHANG W, ZHAO X, ZHAO L, et al. DRL4IR: 2nd workshop on deep reinforcement learning for information retrieval [C]// Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. [S. l.]: ACM, 2021: 2681–2684.
4 BOSSELUT A, LE BRAS R, CHOI Y. Dynamic neuro-symbolic knowledge graph construction for zero-shot commonsense question answering [C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l.]: AAAI, 2021: 4923–4931.
5 CAO Q, TRIVEDI H, BALASUBRAMANIAN A, et al. DeFormer: decomposing pre-trained Transformers for faster question answering [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Seattle: ACL, 2020: 4487–4497.
6 JIANG Qian, TANG Haoye, TU Yonghui, et al. Exploration of hierarchical and characteristic operation modes for large instruments and equipment [J]. Experimental Technology and Management, 2024, 41(10): 266–270.
7 ZHANG Ke, WAN Hong, ZHANG Xiaoxiao, et al. Discussion on the open operation management of large instruments in the analysis and testing center of provincial universities [J]. Labor Praxis, 2024(4): 87–90. doi: 10.20175/j.syyfx.20240414
8 CHEN Y, XU L, LIU K, et al. Event extraction via dynamic multi-pooling convolutional neural networks [C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Beijing: ACL, 2015: 167–176.
9 SUBBURATHINAM A, LU D, JI H, et al. Cross-lingual structure transfer for relation and event extraction [C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong: ACL, 2019: 313–325.
10 ZHANG J, QIN Y, ZHANG Y, et al. Extracting entities and events as a single task using a transition-based neural model [C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macau: MKP, 2019: 5422–5428.
11 LI Q, LI J, SHENG J, et al. A survey on deep learning event extraction: approaches and applications [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 35(5): 6301–6321.
12 XIANG Wei, WANG Bang. Survey of Chinese event extraction research [J]. Computer Technology and Development, 2020, 30(2): 1–6. doi: 10.3778/j.issn.1002-8331.2203-0453
13 NGUYEN T H, GRISHMAN R. Event detection and domain adaptation with convolutional neural networks [C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Beijing: ACL, 2015: 365–371.
14 YAO L, MAO C, LUO Y. Graph convolutional networks for text classification [C]// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu: AAAI, 2019: 7370–7377.
15 ZHU M, ZENG K, WU J, et al. LC4EE: LLMs as good corrector for event extraction [C]// Findings of the Association for Computational Linguistics. St. Julian's, Malta: ACL, 2024: 12028–12038.
16 TSUJIMURA T, YAMADA K, IDA R, et al. Contextualized medication event extraction with striding NER and multi-turn QA [J]. Journal of Biomedical Informatics, 2023, 144: 104416. doi: 10.1016/j.jbi.2023.104416
17 LU Y, LIU Q, DAI D, et al. Unified structure generation for universal information extraction [C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Dublin: ACL, 2022: 5755–5772.
18 WANG X, ZHOU W, ZU C, et al. InstructUIE: multi-task instruction tuning for unified information extraction [EB/OL]. (2023-04-17) [2024-12-20]. https://arxiv.org/abs/2304.08085.
19 GAO J, ZHAO H, YU C, et al. Exploring the feasibility of ChatGPT for event extraction [EB/OL]. (2023-03-09) [2024-12-20]. https://arxiv.org/abs/2303.03836.
20 BONISOLI G, VILARES D, ROLLO F, et al. Document-level event extraction from Italian crime news using minimal data [J]. Knowledge-Based Systems, 2025, 317: 113386. doi: 10.1016/j.knosys.2025.113386
21 LI Z, ZENG Y, ZUO Y, et al. KnowCoder: coding structured knowledge into LLMs for universal information extraction [C]// Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. Vancouver: ACL, 2024: 8758–8779.
22 LOU J, LU Y, DAI D, et al. Universal information extraction as unified semantic matching [C]// Proceedings of the AAAI Conference on Artificial Intelligence. [S. l.]: AAAI, 2023: 13318–13326.
23 HU E J, SHEN Y, WALLIS P, et al. LoRA: low-rank adaptation of large language models [EB/OL]. [2025-02-25]. https://arxiv.org/abs/2106.09685.
24 LIU S, LI Y, ZHANG F, et al. Event detection without triggers [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. Minneapolis: ACL, 2019: 735–744.
25 DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. Minneapolis: ACL, 2019: 4171–4186.
26 DU X, CARDIE C. Event extraction by answering (almost) natural questions [C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Punta Cana: ACL, 2020: 671–683.
27 SHIRI F, MOGHIMIFAR F, HAFFARI R, et al. Decompose, enrich, and extract! Schema-aware event extraction using LLMs [C]// Proceedings of the 27th International Conference on Information Fusion. Florence: IEEE, 2024: 1–8.
28 WANG S, HUANG L. Targeted augmentation for low-resource event extraction [C]// Findings of the Association for Computational Linguistics: NAACL 2024. Mexico City: ACL, 2024: 4414–4428.