Journal of ZheJiang University (Engineering Science)  2020, Vol. 54 Issue (9): 1753-1760    DOI: 10.3785/j.issn.1008-973X.2020.09.011
    
Detection of DNS tunnels based on log statistics feature
Qi WANG1, Kun XIE1, Yan MA1,*, Qun CONG2
1. Information Network Center, Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Beijing Wrdtech Co. Ltd, Beijing 100876, China

Abstract  

DNS server logs were used as the data source. Multi-dimensional statistical features were extracted for each second-level domain name, such as the entropy of the domain name, the number of subdomains, and the cache hit rate, and the logs were quantized into a set of feature vectors. A random forest model was trained on this feature vector set, and its parameters were tuned by ten-fold cross-validation to improve the overall detection accuracy. Comparative experiments were then run with different classification algorithms, and the results were compared with existing research methods. The experimental results show that the proposed detection method achieves an accuracy rate of not less than 90% at a recall rate of 98.5%, which is an improvement in detection accuracy. The proposed algorithm can therefore detect DNS tunnels effectively.
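The ten-fold cross-validation step mentioned in the abstract can be sketched in plain Python. This is only an illustration of the splitting and averaging scheme, not the authors' training code: the names `k_fold_indices` and `cross_val_score` are hypothetical, and the classifier is left as caller-supplied `fit` and `score` functions (in practice a random forest from a machine learning library would be plugged in there).

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is test data exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_val_score(fit, score, X, y, k=10):
    """Mean score over k folds.

    fit(X_train, y_train) -> model and score(model, X_test, y_test) -> float
    are hypothetical callables supplied by the caller.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return sum(scores) / len(scores)
```

Parameter tuning then amounts to calling `cross_val_score` once per candidate parameter setting and keeping the setting with the best mean score.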



Key words: DNS tunnel, log analysis, DNS cache, random forest, malicious domain name
Received: 24 September 2019      Published: 22 September 2020
CLC:  TP 302  
Corresponding Authors: Yan MA     E-mail: vinchy_wq@qq.com;mayan@bupt.edu.cn
Cite this article:

Qi WANG,Kun XIE,Yan MA,Qun CONG. Detection of DNS tunnels based on log statistics feature. Journal of ZheJiang University (Engineering Science), 2020, 54(9): 1753-1760.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2020.09.011     OR     http://www.zjujournals.com/eng/Y2020/V54/I9/1753


Fig.1 Flow chart of DNS tunnel detection method based on log statistical characteristics
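| Parameter | Meaning |
| --- | --- |
| $E$ | Entropy of the second-level domain name |
| ${P_{\rm{t}}}$ | Proportion of queries for A/AAAA resource record types |
| ${C_{\rm{s}}}$ | Number of subdomains under the second-level domain |
| ${P_{\rm{s}}}$ | Proportion of distinctive subdomains under the second-level domain |
| $L$ | Domain name length |
| ${L_{{\rm{mvd}}}}$ | Longest vowel distance |
| ${P_{\rm{c}}}$ | Cache hit rate |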
Tab.1 Log features used by proposed detection algorithm
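A minimal sketch of how several of the features in Tab.1 might be computed per second-level domain. The page does not give the exact definitions, so the following are assumptions: the log schema `(query_name, query_type, cache_hit)` is hypothetical, $E$ is taken as the mean Shannon character entropy of the queried names, ${L_{{\rm{mvd}}}}$ as the longest run of non-vowel characters, and the second-level domain is naively taken as the last two labels (which is wrong for suffixes such as `co.uk`). ${P_{\rm{s}}}$ is omitted because its definition cannot be recovered from the table alone.

```python
import math
from collections import Counter, defaultdict

VOWELS = set("aeiou")

def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(text).values())

def longest_vowel_distance(text):
    """Longest run of consecutive non-vowel characters (assumed definition)."""
    longest = run = 0
    for ch in text.lower():
        if ch in VOWELS:
            run = 0
        else:
            run += 1
            longest = max(longest, run)
    return longest

def second_level_domain(name):
    """Naive assumption: the last two labels form the second-level domain."""
    labels = name.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])

def extract_features(records):
    """Group (query_name, query_type, cache_hit) records and compute features."""
    groups = defaultdict(list)
    for name, qtype, cache_hit in records:
        name = name.rstrip(".").lower()
        groups[second_level_domain(name)].append((name, qtype, cache_hit))

    features = {}
    for sld, rows in groups.items():
        names = [name for name, _, _ in rows]
        # Distinct subdomain prefixes, e.g. "www" for "www.example.com".
        prefixes = {name[: -len(sld)].rstrip(".") for name in names} - {""}
        features[sld] = {
            "E": sum(shannon_entropy(n) for n in names) / len(names),
            "P_t": sum(q in ("A", "AAAA") for _, q, _ in rows) / len(rows),
            "C_s": len(prefixes),
            "L": sum(len(n) for n in names) / len(names),
            "L_mvd": max(longest_vowel_distance(n) for n in names),
            "P_c": sum(bool(hit) for _, _, hit in rows) / len(rows),
        }
    return features

# Made-up records for illustration only.
sample = [
    ("www.example.com", "A", True),
    ("x7f3k9q2.tunnel.test", "TXT", False),
]
feature_vectors = extract_features(sample)
```

Each value in `feature_vectors` is one row of the feature vector set that a classifier would be trained on.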
Fig.2 Training flow of the DNS tunnel detection model
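| Confusion matrix | Predicted positive | Predicted negative |
| --- | --- | --- |
| Actual positive | TP | FN |
| Actual negative | FP | TN |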
Tab.2 Confusion matrix
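For reference, the standard precision and recall formulas follow directly from the confusion matrix entries in Tab.2. The sketch below assumes that ${R_{\rm{pre}}}$ and ${R_{\rm{re}}}$ in the result tables denote precision and recall. The function names are hypothetical, and the counts in the usage lines are invented for illustration, not taken from the paper.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of actual positives that are detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Illustrative counts only (not experimental data).
tp, fn, fp, tn = 90, 10, 30, 870
r_pre = precision(tp, fp)  # 90 / 120
r_re = recall(tp, fn)      # 90 / 100
```

In tunnel detection the positive class would be tunnel traffic, so recall measures how many tunnel domains are caught and precision measures how many alarms are genuine.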
Fig.3 Receiver operating characteristic (ROC) curve sample
Fig.4 Distribution of longest vowel distance for normal traffic and tunnel traffic
Fig.5 Cache hit ratio distribution of normal traffic and tunnel traffic
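| Classification algorithm | ${\delta }$ | ${R_{\rm{pre} }}$/% | ${R_{\rm{re}}}$/% |
| --- | --- | --- | --- |
| Logistic regression | 0.902 | 80.7 | 95.4 |
| Naive Bayes | 0.818 | 72.7 | 90.1 |
| SVM | 0.951 | 93.6 | 97.2 |
| Random forest | 0.991 | 95.2 | 98.5 |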
Tab.3 Comparison of experimental results by different classification algorithms
| Method | ${R_{{\rm{pre}}} }$/% | ${R_{\rm{re}} }$/% |
| --- | --- | --- |
| Proposed method | 95.2 | 98.5 |
| Comparison experiment | 88.1 | 97.9 |
Tab.4 Comparison of experimental results under different feature dimensions
Fig.6 Comparison of ten-fold cross-validation model scores under different feature dimensions