Network traffic anomaly detection based on feature-based symbolic representation

Peng ZHAN1,2, Lin CHEN1,2,*, Lu-hui CAO2, Xue-qing LI1

1. School of Software, Shandong University, Jinan 250100, China
2. Informatization Office, Shandong University, Jinan 250100, China
Abstract A network traffic anomaly detection algorithm based on feature-based symbolic representation (NAAD-FD) is proposed to detect network traffic anomalies accurately and to safeguard network quality. Network traffic data are segmented at traffic turning points and converted into a feature-based symbolic representation. Seven characteristic values are then extracted from each subsequence and used in the proposed distance measure. Anomalous traffic sequences are detected with a density-based algorithm, following a time-series-based definition of network traffic anomaly. Experiments on algorithm parameters, simulated data, and real network traffic data show that the algorithm is robust, and they verify its validity and stability. The feature-based symbolic representation significantly reduces the time complexity of the algorithm and accelerates network traffic anomaly detection by around 40%.
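The pipeline outlined in the abstract (segment at turning points, describe each subsequence by a few features, then score subsequences by how isolated they are) can be illustrated with a minimal Python sketch. It is not the NAAD-FD implementation. The abstract does not list the seven characteristic values or the proposed distance measure, so the sketch assumes four stand-in features (length, mean, slope, spread) and plain Euclidean distance, and it uses the distance to the k-th nearest neighbour as a crude density proxy. All function names are hypothetical.

```python
from math import sqrt


def turning_points(series):
    """Indices where the series changes direction, plus both end points.

    Assumes len(series) >= 2. Flat plateaus are not treated as turning points.
    """
    points = [0]
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:  # rising then falling, or the reverse
            points.append(i)
    points.append(len(series) - 1)
    return points


def segment_features(series, points):
    """One feature vector per segment between consecutive turning points.

    The four features are illustrative stand-ins, not the paper's seven values.
    """
    features = []
    for start, end in zip(points, points[1:]):
        chunk = series[start:end + 1]
        length = end - start
        mean = sum(chunk) / len(chunk)
        slope = (series[end] - series[start]) / length
        spread = max(chunk) - min(chunk)
        features.append((length, mean, slope, spread))
    return features


def kth_neighbor_distance(features, k=2):
    """Euclidean distance from each vector to its k-th nearest neighbour.

    A large value means the segment lies in a sparse region of feature space.
    Requires more than k segments.
    """
    scores = []
    for i, a in enumerate(features):
        dists = sorted(
            sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
            for j, b in enumerate(features)
            if j != i
        )
        scores.append(dists[k - 1])
    return scores


def score_segments(series, k=2):
    """Return (start_index, end_index, score) for every segment."""
    points = turning_points(series)
    features = segment_features(series, points)
    scores = kth_neighbor_distance(features, k)
    return [(s, e, sc) for (s, e), sc in zip(zip(points, points[1:]), scores)]
```

Called on a regular sawtooth with a single spike, for example `score_segments([0, 1, 0, 1, 0, 1, 0, 10, 0, 1, 0, 1, 0])`, the two segments that rise into and fall out of the spike receive the largest scores, while the repeated ordinary segments score zero because each has identical neighbours. Working on one short feature vector per segment rather than on every raw sample is the kind of reduction the abstract credits for the lower time complexity.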
Received: 19 September 2019
Published: 05 July 2020
Corresponding Authors:
Lin CHEN
E-mail: zhanpeng@sdu.edu.cn;chenlin@sdu.edu.cn
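Network anomalous traffic detection algorithm based on feature symbolic representation
To accurately detect traffic anomalies in a network and keep the network running normally, a network anomalous traffic detection algorithm based on feature symbolic representation (NAAD-FD) is proposed. NAAD-FD uses trend turning points to convert network traffic data with a symbolic representation method based on trend features. According to the resulting representation, the original data are converted into subsequences that each contain seven feature values, and these seven feature values are used in the proposed distance calculation method. Combined with a density-based algorithm, anomaly detection is performed according to a time-series definition of anomalous network traffic. Experiments and analysis on algorithm parameters, simulated data, and real network traffic data show that the algorithm is strongly robust and verify its effectiveness and stability. By simplifying the representation through dimensionality reduction, the algorithm significantly lowers its time complexity and speeds up anomaly detection by about 40%.
Keywords: network traffic anomaly, time series, trend feature, symbolic approximation, turning point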