Software vulnerable code reuse detection method based on vulnerability fingerprint

doi:10.3785/j.issn.1008-973X.2018.11.017

JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE)

2018, Vol. 52, Issue 11: 2180-2190

Computer Technology

LIU Zhen, WU Ze-hui, CAO Yan, WEI Qiang

State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450001, China

Abstract

A detection method for vulnerable code reuse based on vulnerability fingerprints was proposed to address the high false negative rate of traditional detection techniques. The structure of vulnerability patches in open source projects and the characteristics of vulnerable code were analyzed, the modifications commonly made during code reuse were summarized, and a hash-based vulnerability fingerprint model was designed. Code preprocessing was applied to eliminate the influence of irrelevant factors. Code blocks with a fixed number of lines were taken as the granularity of feature abstraction, and a hash algorithm was used to extract key code features. A vulnerability sample database was built by collecting vulnerability information and the related code snippets from open source projects. A similarity evaluation algorithm based on the longest common subsequence (LCS) was used to locate reuse of the vulnerability samples, and the matching code was marked as sensitive code. The vulnerability fingerprints were then applied to the sensitive code, and vulnerable code was identified according to the judging strategy. The experimental results showed that the proposed method copes effectively with a variety of code modification methods and markedly improves detection efficiency, and that detection time grows linearly with the amount of input code.
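The two building blocks named in the abstract, hashing fixed-size blocks of preprocessed code and measuring LCS-based similarity, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The block size `WINDOW`, the use of MD5, the comment- and whitespace-stripping rules, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
import hashlib
import re

WINDOW = 4  # hypothetical block size in lines; the abstract gives no value


def normalize(source: str) -> list[str]:
    """Assumed preprocessing: drop C-style comments, blank lines and whitespace.

    The regexes are naive and would also strip comment markers that occur
    inside string literals.
    """
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)  # block comments
    source = re.sub(r"//[^\n]*", "", source)               # line comments
    lines = (re.sub(r"\s+", "", line) for line in source.splitlines())
    return [line for line in lines if line]


def fingerprint(lines: list[str], window: int = WINDOW) -> set[str]:
    """Hash every run of `window` consecutive normalized lines."""
    return {
        hashlib.md5("\n".join(lines[i:i + window]).encode()).hexdigest()
        for i in range(len(lines) - window + 1)
    }


def lcs_length(a, b) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(sample: list[str], target: list[str]) -> float:
    """Fraction of the sample's lines that appear, in order, in the target."""
    return lcs_length(sample, target) / len(sample) if sample else 0.0


if __name__ == "__main__":
    sample = normalize('char buf[16];\n/* copy */\nstrcpy(buf, input);\n'
                       'printf("%s", buf);\nreturn 0;\n')
    target = normalize('int n = 0;\nchar  buf[16];\nstrcpy(buf,input); // unchecked\n'
                       'printf("%s", buf);\nreturn 0;\n')
    print(similarity(sample, target))                 # LCS-based stage
    print(fingerprint(sample) <= fingerprint(target))  # fingerprint stage
```

In this sketch a target would first be flagged as sensitive when `similarity` exceeds some threshold, and then checked for the sample's block hashes. The actual threshold and judging strategy are defined in the paper and are not reproduced here.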

Received: 12 December 2017 Published: 22 November 2018

CLC: TP393

Cite this article:

LIU Zhen, WU Ze-hui, CAO Yan, WEI Qiang. Software vulnerable code reuse detection method based on vulnerability fingerprint. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2018, 52(11): 2180-2190.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2018.11.017 OR http://www.zjujournals.com/eng/Y2018/V52/I11/2180
