Journal of Zhejiang University (Engineering Science)  2020, Vol. 54 Issue (9): 1753-1760    DOI: 10.3785/j.issn.1008-973X.2020.09.011
Computer Technology
Detection of DNS tunnels based on log statistics feature
Qi WANG1, Kun XIE1, Yan MA1,*, Qun CONG2
1. Information Network Center, Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Beijing Wrdtech Co. Ltd, Beijing 100876, China
Abstract:

DNS server logs were used as the data source to extract multi-dimensional statistical features of second-level domains, such as the entropy of the domain, the number of subdomains, and the cache hit rate, so that the logs were quantized into a set of feature vectors. This feature vector set was used to train a random forest model, and ten-fold cross-validation was used to tune the model parameters and optimize the model, improving the overall detection accuracy. Comparative experiments were carried out with different classification algorithms, and the results were compared with those of existing methods. The experimental results show that the proposed detection method achieves a precision of no less than 90% at a recall of 98.5%, which is an improvement in detection accuracy. The proposed algorithm can therefore detect DNS tunnels effectively.

Key words: DNS tunnel; log analysis; DNS cache; random forest; malicious domain name
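
The parameter tuning step described in the abstract can be illustrated with a minimal ten-fold cross-validation sketch. It uses only the Python standard library and, as a stand-in for the random forest, a one-feature threshold classifier. The synthetic data, the function names (`k_fold_indices`, `fit_stump`, `cv_accuracy`) and the use of accuracy as the score are assumptions made for illustration and are not taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices once and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_stump(rows, labels, feature):
    """Stand-in model: threshold halfway between the two class means."""
    pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
    neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cv_accuracy(rows, labels, feature, k=10, seed=0):
    """Mean held-out accuracy over k folds for the stump on one feature."""
    scores = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        threshold = fit_stump([rows[i] for i in train],
                              [labels[i] for i in train], feature)
        # Assumes the positive (tunnel) class has the larger feature value.
        correct = sum((rows[i][feature] > threshold) == (labels[i] == 1)
                      for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Synthetic data (hypothetical): feature 0 is informative, feature 1 is noise.
rng = random.Random(42)
rows = [(rng.gauss(2.5, 0.4), rng.random()) for _ in range(100)]
rows += [(rng.gauss(4.2, 0.4), rng.random()) for _ in range(100)]
labels = [0] * 100 + [1] * 100

# The "parameter" being tuned here is simply which feature the stump uses.
scores = {feature: cv_accuracy(rows, labels, feature) for feature in (0, 1)}
best_feature = max(scores, key=scores.get)
print(best_feature, scores)
```

With a real classifier, a library implementation such as scikit-learn's `RandomForestClassifier` combined with `GridSearchCV(cv=10)` would likely replace the stand-in model, while the fold logic stays the same.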
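Received: 2019-09-24; Published: 2020-09-22
CLC: TP 302
Funding: Fundamental Research Funds for the Central Universities (2019RC53); National CNGI Special Project (CNGI-12-03-001)
Corresponding author: Yan MA. E-mail: vinchy_wq@qq.com; mayan@bupt.edu.cn
About the author: Qi WANG (born 1995), female, master's student, working on network management research. orcid.org/0000-0003-0052-5360. E-mail: vinchy_wq@qq.cn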

Cite this article:

Qi WANG, Kun XIE, Yan MA, Qun CONG. Detection of DNS tunnels based on log statistics feature[J]. Journal of Zhejiang University (Engineering Science), 2020, 54(9): 1753-1760.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2020.09.011
http://www.zjujournals.com/eng/CN/Y2020/V54/I9/1753

Fig. 1  Flowchart of the DNS tunnel detection method based on log statistical features
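| Parameter | Meaning |
| --- | --- |
| $E$ | Entropy of the second-level domain |
| ${P_{\rm{t}}}$ | Share of queries for A/AAAA resource record types |
| ${C_{\rm{s}}}$ | Number of subdomains under the second-level domain |
| ${P_{\rm{s}}}$ | Share of unique subdomains under the second-level domain |
| $L$ | Domain name length |
| ${L_{{\rm{mvd}}}}$ | Longest vowel distance |
| ${P_{\rm{c}}}$ | Cache hit rate |

Tab. 1  Log features used by the proposed detection algorithm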
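
As a rough illustration of how the features in Table 1 might be derived from log records, the sketch below groups queries by second-level domain and computes one value per feature. The record layout `(qname, qtype, cache_hit)`, the naive "last two labels" rule for finding the second-level domain, and the exact definition of each feature (for example, treating the longest vowel distance as the longest run of non-vowel characters) are assumptions; the paper's precise definitions are not given on this page.

```python
import math
from collections import Counter, defaultdict

VOWELS = set("aeiou")


def shannon_entropy(text):
    """Character-level Shannon entropy in bits (0.0 for empty text)."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(c / total * math.log2(c / total)
                for c in Counter(text).values())


def longest_vowel_gap(text):
    """Longest run of consecutive non-vowel characters (assumed L_mvd)."""
    longest = current = 0
    for ch in text.lower():
        if ch in VOWELS:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def split_name(qname):
    """Naive split into (subdomain, second-level domain): last two labels."""
    labels = qname.rstrip(".").lower().split(".")
    return ".".join(labels[:-2]), ".".join(labels[-2:])


def extract_features(records):
    """records: iterable of (qname, qtype, cache_hit) -> {sld: features}."""
    groups = defaultdict(list)
    for qname, qtype, cache_hit in records:
        sub, sld = split_name(qname)
        groups[sld].append((sub, qname, qtype, cache_hit))
    features = {}
    for sld, rows in groups.items():
        n = len(rows)
        subs = [sub for sub, _, _, _ in rows]
        distinct = set(subs)
        features[sld] = {
            "E": shannon_entropy("".join(subs)),
            "P_t": sum(qtype in ("A", "AAAA") for _, _, qtype, _ in rows) / n,
            "C_s": len(distinct),
            "P_s": len(distinct) / n,
            "L": sum(len(qname) for _, qname, _, _ in rows) / n,
            "L_mvd": max(longest_vowel_gap(sub) for sub in subs),
            "P_c": sum(bool(hit) for _, _, _, hit in rows) / n,
        }
    return features


# Hypothetical log records: (queried name, record type, served from cache?)
records = [
    ("www.example.com", "A", True),
    ("www.example.com", "A", True),
    ("mail.example.com", "AAAA", True),
    ("www.example.com", "A", False),
    ("x7f3kq9zb2.t.tunnel.test", "TXT", False),
    ("p0q8w4mzr1.t.tunnel.test", "TXT", False),
]
for sld, vector in extract_features(records).items():
    print(sld, vector)
```

A production version would presumably resolve the registered domain with a public suffix list rather than the last two labels, since names under suffixes such as `co.uk` would otherwise be grouped incorrectly.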
Fig. 2  Training process of the DNS tunnel detection model
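| Actual value \ Predicted value | Positive | Negative |
| --- | --- | --- |
| Positive | TP | FN |
| Negative | FP | TN |

Tab. 2  Confusion matrix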
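
The confusion matrix entries translate directly into the usual precision and recall ratios. The sketch below counts TP, FN, FP and TN from label lists and derives both ratios; it assumes that the $R_{\rm{pre}}$ and $R_{\rm{re}}$ columns reported in the result tables correspond to precision and recall in this standard sense, and the example labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count (TP, FN, FP, TN) for a binary classification task."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def precision_recall(tp, fn, fp):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 1 = tunnel traffic, 0 = normal traffic.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
tp, fn, fp, tn = confusion_counts(actual, predicted)
print(tp, fn, fp, tn, precision_recall(tp, fn, fp))
```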
Fig. 3  Example of a receiver operating characteristic (ROC) curve
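
For readers unfamiliar with ROC curves, the following sketch shows one common way to build the curve from classifier scores and to compute the area under it with the trapezoidal rule. The labels and scores are hypothetical, and the function names `roc_points` and `auc` are illustrative; nothing here reproduces the paper's own curve.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points while lowering the decision threshold.

    labels are 0/1 and must contain at least one of each class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores: higher means "more likely to be tunnel traffic".
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve, auc(curve))
```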
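Fig. 4  Distribution of the longest vowel distance for normal traffic and tunnel traffic
Fig. 5  Distribution of the cache hit rate for normal traffic and tunnel traffic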
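| Classification algorithm | ${\delta }$ | ${R_{\rm{pre} }}$/% | ${R_{\rm{re}}}$/% |
| --- | --- | --- | --- |
| Logistic regression | 0.902 | 80.7 | 95.4 |
| Naive Bayes | 0.818 | 72.7 | 90.1 |
| SVM | 0.951 | 93.6 | 97.2 |
| Random forest | 0.991 | 95.2 | 98.5 |

Tab. 3  Comparison of experimental results for different classification algorithms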
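| Method | ${R_{{\rm{pre}}} }$/% | ${R_{\rm{re}} }$/% |
| --- | --- | --- |
| Proposed method | 95.2 | 98.5 |
| Comparison experiment | 88.1 | 97.9 |

Tab. 4  Comparison of experimental results under different feature dimensions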
Fig. 6  Comparison of ten-fold cross-validation model scores under different feature dimensions