Journal of Zhejiang University (Engineering Science)  2020, Vol. 54 Issue (9): 1761-1767    DOI: 10.3785/j.issn.1008-973X.2020.09.012
Computer Technology
Risky accounts evaluation method of campus network
Huang-yao ZENG1, Dan-dan LI1, Yan MA1,*, Qun CONG2
1. Information Network Center, Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Beijing Wrdtech Co. Ltd, Beijing 100082, China
Abstract:

A method was proposed that locates risky accounts by detecting risky devices in the accounts' URL access logs. Access behavior features, such as the dispersion of device occurrences, the device multi-account risk level, and the proportion of paid-network usage, were extracted and quantified into a set of feature vectors. The feature vectors were clustered with a Gaussian mixture model (GMM) to obtain the probability that a device exhibits abnormal access behavior. The adjusted cosine similarity algorithm was used to calculate how similar the URLs accessed by devices of the same type under the same account are. The GMM clustering results and the adjusted cosine similarity results were combined to produce the risk evaluation of each account. Experimental results show that the method achieves a detection rate of 85% with a false alarm rate below 5%, and can promptly detect risky accounts in campus networks with a small IP address range and infrequent account logins.

Key words: uniform resource locator (URL)    campus network    risk assessment    Gaussian mixture model (GMM)    cosine similarity
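As a rough illustration of the clustering step described in the abstract, the sketch below computes the posterior probability that a device's feature vector belongs to the "abnormal" component of an already fitted two-component GMM with diagonal covariances. The component weights, means, and variances are made-up numbers, not values from the paper, and the three feature dimensions only mirror the feature names listed in the abstract; how the paper fits the model and decides which component is abnormal is not shown on this page.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def abnormal_probability(x, components, abnormal_index):
    """Posterior probability (responsibility) of the chosen component for x."""
    weighted = [
        c["weight"] * diag_gaussian_pdf(x, c["mean"], c["var"])
        for c in components
    ]
    return weighted[abnormal_index] / sum(weighted)


# Hypothetical fitted parameters (assumptions for illustration only).
# Feature order: [occurrence dispersion, multi-account risk, paid-network share]
COMPONENTS = [
    {"weight": 0.8, "mean": [0.1, 0.1, 0.2], "var": [0.02, 0.02, 0.05]},  # "normal"
    {"weight": 0.2, "mean": [0.7, 0.8, 0.6], "var": [0.05, 0.05, 0.05]},  # "abnormal"
]

p_typical = abnormal_probability([0.1, 0.1, 0.2], COMPONENTS, abnormal_index=1)
p_suspect = abnormal_probability([0.7, 0.8, 0.6], COMPONENTS, abnormal_index=1)
print(p_typical, p_suspect)
```

A device near the "normal" mean gets a posterior close to 0 and one near the "abnormal" mean gets a posterior close to 1; in practice the parameters would come from fitting the GMM (for example with the EM algorithm) on the real feature vectors.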
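Received: 2019-07-30    Published: 2020-09-22
CLC:  TP 302
Funding: Fundamental Research Funds for the Central Universities (2018RC21); National CNGI Special Project (CNGI-12-03-001)
Corresponding author: Yan MA     E-mail: molunerfinn@gmail.com;mayan@bupt.edu.cn
About the author: Huang-yao ZENG (born 1995), male, master's student, working on cyberspace security research. orcid.org/0000-0002-8278-9695. E-mail: molunerfinn@gmail.com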

Cite this article:

Huang-yao ZENG, Dan-dan LI, Yan MA, Qun CONG. Risky accounts evaluation method of campus network[J]. Journal of Zhejiang University (Engineering Science), 2020, 54(9): 1761-1767.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2020.09.012
http://www.zjujournals.com/eng/CN/Y2020/V54/I9/1761

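Field    Meaning
TIME     Access time
LABEL    Access label
MAC      MAC address of the device
URL      URL accessed
DEVICE   Device type
POS      Geographic location of the device at access time
USER     User account
IP       IP address used by the device
SSID     Service set identifier (SSID) the device connected to
Table 1  Storage fields of the user URL access log dataset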
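To make the log layout in Table 1 concrete, the sketch below models one record as a Python dataclass and counts how many distinct accounts appear on each device MAC address. This count is only a plausible raw ingredient for a feature such as the device multi-account risk level; the paper's actual formula is not reproduced on this page. The field types and the sample values are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UrlAccessRecord:
    """One row of the URL access log; field names follow Table 1."""
    time: str    # TIME: access time (string format is an assumption)
    label: str   # LABEL: access label
    mac: str     # MAC: device MAC address
    url: str     # URL: accessed URL
    device: str  # DEVICE: device type
    pos: str     # POS: geographic location of the access
    user: str    # USER: user account
    ip: str      # IP: IP address used by the device
    ssid: str    # SSID: service set identifier


def accounts_per_device(records):
    """Map each MAC address to the number of distinct accounts seen on it."""
    users_by_mac = defaultdict(set)
    for record in records:
        users_by_mac[record.mac].add(record.user)
    return {mac: len(users) for mac, users in users_by_mac.items()}


# Made-up sample rows for illustration only.
SAMPLE_RECORDS = [
    UrlAccessRecord("2019-05-01 08:00:00", "news", "aa:bb:cc:00:00:01",
                    "http://example.com/a", "phone", "library", "user1",
                    "10.0.0.1", "campus-wifi"),
    UrlAccessRecord("2019-05-01 09:30:00", "video", "aa:bb:cc:00:00:01",
                    "http://example.com/b", "phone", "library", "user2",
                    "10.0.0.1", "campus-wifi"),
    UrlAccessRecord("2019-05-01 10:15:00", "news", "aa:bb:cc:00:00:02",
                    "http://example.com/a", "laptop", "dormitory", "user1",
                    "10.0.0.2", "campus-wifi"),
]

counts = accounts_per_device(SAMPLE_RECORDS)
print(counts)
```

In this sample the first device is shared by two accounts and the second by one, which is the kind of signal a multi-account feature could build on.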
Actual result / Clustering result    Risky        Not risky
Risky                                TP=11 737    FN=2 044
Not risky                            FP=11 573    TN=73 121
Table 2  Clustering results of the Gaussian mixture model (GMM)
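The counts in Table 2 can be turned into rates with the usual confusion-matrix definitions, detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN). The short sketch below assumes the paper uses these standard definitions; the function names are illustrative.

```python
def detection_rate(tp, fn):
    """Share of truly risky samples that were flagged as risky."""
    return tp / (tp + fn)


def false_alarm_rate(fp, tn):
    """Share of truly non-risky samples that were flagged as risky."""
    return fp / (fp + tn)


# Counts copied from Table 2 (GMM clustering stage).
TP, FN, FP, TN = 11737, 2044, 11573, 73121

dr = detection_rate(TP, FN)
far = false_alarm_rate(FP, TN)
print(f"detection rate:   {dr:.3f}")   # detection rate:   0.852
print(f"false alarm rate: {far:.3f}")  # false alarm rate: 0.137
```

Under these definitions the GMM stage alone flags about 85.2% of risky samples with a false alarm rate of about 13.7%. The rates quoted in the abstract refer to the full method, which also uses the URL similarity result.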
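Fig. 1  Values of the dispersion of device occurrences for 2 000 randomly selected samples
Fig. 2  Values of the device multi-account risk level for 2 000 randomly selected samples
Fig. 3  Values of the proportion of paid-network usage for 2 000 randomly selected samples
Fig. 4  Proportions of the opposing-location risk level for 2 000 randomly selected samples
Fig. 5  Values of the device URL access similarity for 2 000 randomly selected samples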
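The URL access similarity in Fig. 5 comes from the adjusted cosine similarity named in the abstract. One common form, borrowed from collaborative filtering, subtracts a mean from each component before taking the cosine. Which mean the paper subtracts is not stated on this page, so the sketch below assumes a per-URL mean supplied by the caller; the visit counts and the means are made up.

```python
import math


def adjusted_cosine(a, b, means):
    """Cosine similarity of a and b after subtracting per-dimension means."""
    centered_a = [x - m for x, m in zip(a, means)]
    centered_b = [y - m for y, m in zip(b, means)]
    dot = sum(x * y for x, y in zip(centered_a, centered_b))
    norm_a = math.sqrt(sum(x * x for x in centered_a))
    norm_b = math.sqrt(sum(y * y for y in centered_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; 0.0 is a convention
    return dot / (norm_a * norm_b)


# Hypothetical visit counts of three same-type devices over four URLs.
URL_MEANS = [5, 5, 5, 5]  # assumed per-URL mean visit counts
phone_a = [9, 8, 1, 2]
phone_b = [8, 9, 2, 1]
phone_c = [1, 2, 9, 8]

sim_ab = adjusted_cosine(phone_a, phone_b, URL_MEANS)
sim_ac = adjusted_cosine(phone_a, phone_c, URL_MEANS)
print(sim_ab, sim_ac)
```

Here the two devices with matching browsing habits score close to 1, while the device with the opposite habits scores -1. A plain cosine on the raw counts would still give the third device a positive score, because all counts are non-negative, which is one reason to center the vectors first.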