Journal of ZheJiang University (Engineering Science)  2024, Vol. 58 Issue (7): 1357-1365    DOI: 10.3785/j.issn.1008-973X.2024.07.005
    
Fact-based similar case retrieval methods based on statutory knowledge
Linrui LI1, Dongsheng WANG2,*, Hongjie FAN2
1. Guanghua Law School, Zhejiang University, Hangzhou 310008, China
2. School of Information Management for Law, China University of Political Science and Law, Beijing 102249, China

Abstract  

Existing research on similar case retrieval ignores the legal logic that a model should embody, and therefore cannot meet the case-similarity criteria required in practical applications. In addition, Chinese datasets for case retrieval are scarce and do not meet research needs. To address these problems, an interpretable similar case retrieval model grounded in legal logic was proposed, and a case event logic graph was constructed from predicate verbs. The statutory knowledge corresponding to each crime was integrated into the proposed model, and the extracted elements were fed to a neural network-based scorer to perform case retrieval accurately and efficiently. A Confusing-LeCaRD dataset was built for the case retrieval task, with groups of easily confused charges as the main retrieval causes. Experiments show that the normalized discounted cumulative gain (NDCG) of the proposed model was 90.95% on the LeCaRD dataset and 94.64% on the Confusing-LeCaRD dataset, and that the model outperformed TF-IDF, BM25 and BERT-PLI on all metrics.



Key words: similar case retrieval, statutory knowledge, legal logic, event logic graph, deep learning
Received: 21 June 2023      Published: 01 July 2024
CLC:  TP 391:TP 181  
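Fund: Changsha Science and Technology Major Special Project (kh2202006); Scientific Research Innovation Project of China University of Political Science and Law (24KYGH013); Fundamental Research Funds for the Central Universities.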
Corresponding Authors: Dongsheng WANG     E-mail: lilinrui1412@163.com;wangdsh@cupl.edu.cn
Cite this article:

Linrui LI,Dongsheng WANG,Hongjie FAN. Fact-based similar case retrieval methods based on statutory knowledge. Journal of ZheJiang University (Engineering Science), 2024, 58(7): 1357-1365.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2024.07.005     OR     https://www.zjujournals.com/eng/Y2024/V58/I7/1357


Fig.1 Overall architecture of similar case retrieval model
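| Charge | Element-based definition |
|---|---|
| Maltreatment of subordinates | Abusing authority; maltreating subordinates; causing serious injury; causing other serious consequences |
| Obstructing safe driving | Using violence against the driver of a moving vehicle; seizing the driving controls; interfering with the normal operation of a public transport vehicle; endangering public safety |
| Major liability accident | Violating safety management regulations; causing an accident with major casualties; causing other serious consequences |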
Tab.1 Some provisions of repository of statutory knowledge
Fig.2 Construction process diagram of event logic graph of case
Fig.3 Vectorization processing and scorer
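| Element facts | Case facts | Similarity |
|---|---|---|
| Similar | Similar | 3 |
| Similar | Not similar | 2 |
| Not similar | Similar | 1 |
| Not similar | Not similar | 0 |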
Tab.2 Labeling rules for LeCaRD dataset
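The four rows of Tab.2 can be read as a two-bit code in which agreement on element facts outweighs agreement on case facts. The sketch below encodes that reading; the function and parameter names are illustrative and do not come from the paper.

```python
def similarity_label(element_facts_similar: bool, case_facts_similar: bool) -> int:
    """Map the two binary judgments in Tab.2 to a similarity level from 0 to 3."""
    # Element-fact agreement contributes 2 and case-fact agreement contributes 1,
    # which reproduces the four rows of the table.
    return 2 * int(element_facts_similar) + int(case_facts_similar)


if __name__ == "__main__":
    for element in (True, False):
        for case in (True, False):
            print(element, case, similarity_label(element, case))
```

Under this encoding, a candidate whose element facts match always ranks above one whose element facts do not, whatever the case facts are.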
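| Charge | Elements |
|---|---|
| Dangerous driving | Racing motor vehicles on a road; driving a motor vehicle while intoxicated; seriously exceeding the rated passenger capacity while operating a school bus or passenger transport service; seriously exceeding the speed limit while operating a school bus or passenger transport service; transporting hazardous chemicals in violation of chemical safety management regulations |
| Traffic accident | Violating traffic and transport management regulations; causing serious injury; causing death; causing heavy loss of public or private property |
Tab.3 Element definitions of the crime of dangerous driving and the crime of causing a traffic accident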
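| No. | Charge | Case facts |
|---|---|---|
| 1 | Fraud | On 20 March 2009, defendant A, in order to commit fraud, registered a company through an intermediary and recruited five employees online ······ |
| 9 | Robbery | On 2 May 2021, defendant A saw B, an elderly person, walking alone on the road, rushed over and snatched the cloth bag from B's hands ······ |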
Tab.4 Data structure of retrieved case in Confusing-LeCaRD dataset
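| No. | Charge | Case facts | Similarity level |
|---|---|---|---|
| 1307 | Robbery | On 27 August 2018, A was riding an electric bike on the road when defendant B rode past quickly on a motorcycle, and defendant C, on the back seat of B's motorcycle, snatched A's handbag ······ | 3 |
| 2167 | Theft | On 25 September 2019, defendant A saw that the door of B's house was not closed, sneaked into B's home and stole a mobile phone and 200 yuan in cash ······ | 0 |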
Tab.5 Data structure of candidate case in Confusing-LeCaRD dataset
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Learning_rate | 2×10⁻⁴ | Max_len | 192 |
| Batch_size | 1 | Hidden_size | 128 |
| Weight_decay | 0.005 | Key_fact_threshold | 0.15 |
Tab.6 Model training parameters
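For anyone reproducing the setup, the values in Tab.6 could be collected in a single immutable configuration object. This is only a sketch: the class name and lower-case field names are illustrative, and the learning rate is read as 2×10⁻⁴.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Training parameters as listed in Tab.6 (field names are illustrative)."""

    learning_rate: float = 2e-4  # assumption: the table entry is read as 2 x 10^-4
    max_len: int = 192
    batch_size: int = 1
    hidden_size: int = 128
    weight_decay: float = 0.005
    key_fact_threshold: float = 0.15


if __name__ == "__main__":
    print(TrainingConfig())
```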
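| Model | NDCG@5 | NDCG@10 | NDCG@20 | NDCG@30 |
|---|---|---|---|---|
| TF-IDF | 67.23 | 73.46 | 78.36 | 83.40 |
| BM25 | 72.08 | 73.84 | 81.97 | 87.41 |
| BERT-PLI | 83.19 | 85.66 | 91.01 | 91.17 |
| This study | 92.04 | 94.64 | 92.60 | 91.51 |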
Tab.7 Evaluation results of different models based on Confusing-LeCaRD dataset
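For readers unfamiliar with the metric reported in Tab.7, a minimal sketch of NDCG@k follows. It assumes the exponential-gain form (2^rel − 1) with a log2 rank discount, and it takes the ideal ranking from the same list of candidate labels. The source does not state which variant the authors used, and the function names are illustrative.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items, using exponential gain."""
    return sum(
        (2.0 ** rel - 1.0) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking.

    `relevances` holds the graded labels (0 to 3) of all candidates,
    listed in the order in which the model ranked them.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # A ranking that places a level-0 candidate ahead of a level-3 candidate.
    print(ndcg_at_k([0, 3, 2, 1], k=4))
```

The function returns a value between 0 and 1, so it would be multiplied by 100 for comparison with the table.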
| Removed module | NDCG@5 | NDCG@10 | NDCG@20 | NDCG@30 |
|---|---|---|---|---|
| Statutory knowledge | 88.00 (↓3.64) | 90.34 (↓4.30) | 88.22 (↓4.38) | 89.82 (↓1.69) |
| Event logic graph | 88.53 (↓3.51) | 89.08 (↓5.56) | 87.97 (↓4.63) | 88.26 (↓3.25) |
Tab.8 Ablation experiment results based on Confusing-LeCaRD dataset
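| Case No. | Charge | Element facts | Basic facts |
|---|---|---|---|
| +17011 | Traffic accident | {death, causing pulmonary contusion, injury, collision, shock} | drive → reverse → collide → be injured → rescue fails → die → drive → fail to keep a lookout → cause injury → lead to shock → rescue proves ineffective → die → reach compensation agreement |
| +17189 | Traffic accident | {harm, injury, death, causing an accident, death occurred, harm, violation of safety law} | drive → travel → encounter → drive → travel → collide → be injured → call → send for treatment → cause an accident → treat → die → harm → die → reach compensation agreement → accept compensation |
| +717 | Theft | {theft of property, possession, obtained by theft, committing theft, entering a dwelling} | ride → drive → arrive → steal → enter → obtain by theft → enter → obtain by theft → possess → steal by entering a dwelling → recover illegal gains |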
Tab.9 Characteristic information for case to be retrieved and candidate cases
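The basic-fact chains in Tab.9 are sequences of predicate verbs, which suggests a directed-graph reading. The sketch below turns one chain into an adjacency list by linking each event to the one that follows it. The event labels are English glosses of the chain for case +17011, and treating consecutive events as directed edges is an assumption made for illustration; the construction process shown in Fig.2 is not reproduced here.

```python
from typing import Dict, List

# English glosses of the basic-fact chain for case +17011 in Tab.9.
CHAIN_17011 = (
    "drive → reverse → collide → be injured → rescue fails → die → drive → "
    "fail to keep a lookout → cause injury → lead to shock → "
    "rescue proves ineffective → die → reach compensation agreement"
)


def build_event_graph(chain: str, arrow: str = "→") -> Dict[str, List[str]]:
    """Build an adjacency list in which each event points to its successors."""
    events = [event.strip() for event in chain.split(arrow) if event.strip()]
    graph: Dict[str, List[str]] = {event: [] for event in events}
    for src, dst in zip(events, events[1:]):
        if dst not in graph[src]:  # keep each directed edge only once
            graph[src].append(dst)
    return graph


if __name__ == "__main__":
    for event, successors in build_event_graph(CHAIN_17011).items():
        print(f"{event} -> {successors}")
```

Repeated verbs such as "drive" and "die" collapse into single nodes with several outgoing edges, which is what distinguishes a graph from the flat chain.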