Journal of Zhejiang University (Engineering Science)  2018, Vol. 52 Issue (11): 2180-2190    DOI: 10.3785/j.issn.1008-973X.2018.11.017
Computer Technology
Software vulnerable code reuse detection method based on vulnerability fingerprint
LIU Zhen, WU Ze-hui, CAO Yan, WEI Qiang
State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450001, China
Abstract:

A detection method for vulnerable code reuse based on vulnerability fingerprints was proposed to address the high false negative rate of traditional detection techniques. The structure of vulnerability patches in open source projects and the features of vulnerable code were analyzed, the characteristics of modifications commonly made during code reuse were summarized, and a hash-based vulnerability fingerprint model was designed. Code preprocessing was applied to eliminate the influence of irrelevant factors. Code blocks with a fixed number of lines were used as the granularity of feature abstraction, and a hash algorithm was used to extract key code features. A vulnerability sample database was built by collecting vulnerability information and the related code fragments from open source projects. A similarity evaluation algorithm based on the longest common subsequence (LCS) was used to locate reuse of the vulnerability samples and to mark the reused code as sensitive code. The vulnerability fingerprints were then applied to the sensitive code, and vulnerable code was identified according to the judging strategy. The experimental results showed that the proposed method copes effectively with a variety of code modifications and clearly improves detection efficiency, and that detection time grows linearly with the amount of input code.
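The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the preprocessing rules, the hash function, the block size, the similarity threshold, or the judging strategy, so the choices below (stripping C-style comments and whitespace, SHA-256, a 3-line window, a 0.5 threshold, and "at least one shared block hash") are assumptions made only for illustration, and all function names are hypothetical.

```python
import hashlib
import re


def normalize(source: str) -> list[str]:
    """Assumed preprocessing: drop C-style comments, whitespace and blank lines.

    The regexes are deliberately naive (they would also strip "//" inside
    string literals), which is acceptable for a sketch.
    """
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", "", source)                     # line comments
    lines = []
    for line in source.splitlines():
        squeezed = re.sub(r"\s+", "", line)
        if squeezed:
            lines.append(squeezed)
    return lines


def fingerprint(lines: list[str], window: int = 3) -> set[str]:
    """Hash every block of `window` consecutive normalized lines."""
    if not lines:
        return set()
    last_start = max(len(lines) - window, 0)
    return {
        hashlib.sha256("\n".join(lines[i:i + window]).encode()).hexdigest()
        for i in range(last_start + 1)
    }


def lcs_similarity(sample: list[str], target: list[str]) -> float:
    """Line-level LCS length divided by the number of sample lines."""
    if not sample:
        return 0.0
    prev = [0] * (len(target) + 1)
    for x in sample:
        curr = [0]
        for j, y in enumerate(target, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(sample)


def is_vulnerable_reuse(sample: str, target: str,
                        threshold: float = 0.5, window: int = 3) -> bool:
    """Two stages: LCS similarity marks sensitive code, fingerprints confirm it."""
    s, t = normalize(sample), normalize(target)
    if lcs_similarity(s, t) < threshold:
        return False  # not similar enough to count as reuse of the sample
    # Assumed judging strategy: at least one block hash of the sample survives.
    return bool(fingerprint(s, window) & fingerprint(t, window))


sample = "char buf[8];\nstrcpy(buf, src); // unsafe copy\nreturn buf;\n"
reused = "void f(char *src) {\n  char buf[8];\n  /* copied */\n  strcpy(buf,   src);\n  return buf;\n}\n"
patched = "void f(char *src) {\n  char buf[8];\n  strncpy(buf, src, 7);\n  return buf;\n}\n"

print(is_vulnerable_reuse(sample, reused))
print(is_vulnerable_reuse(sample, patched))
```

In this sketch the reformatted, commented copy is still flagged because normalization removes the cosmetic differences before hashing, while the patched copy passes the similarity stage but shares no block hash with the sample.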

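Received: 2017-12-12    Published: 2018-11-22
CLC:  TP393  
Supported by:

National Key Research and Development Program of China (2017YFB0802900)

Corresponding author: WEI Qiang, male, associate professor. orcid.org/0000-0002-4749-1447.     E-mail: funnywei@163.com
About the author: LIU Zhen (1993-), male, master's student, working on software engineering and network security. orcid.org/0000-0002-9740-6933. E-mail: sandikast@163.com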
Cite this article:

LIU Zhen, WU Ze-hui, CAO Yan, WEI Qiang. Software vulnerable code reuse detection method based on vulnerability fingerprint[J]. Journal of Zhejiang University (Engineering Science), 2018, 52(11): 2180-2190.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2018.11.017
http://www.zjujournals.com/eng/CN/Y2018/V52/I11/2180

