A new algorithm for frequency tendency prediction over data streams
GUO Li-chao1, SU Hong-ye1, GOU Qian-wen2
1. Institute of Cyber-System and Control, Zhejiang University, Hangzhou 310027, China;
2. College of Public Administration, Zhejiang University, Hangzhou 310027, China
To address the problem of predicting the frequency tendency of itemsets over data streams, a novel max-min-frequency tendency prediction (MMFTP) algorithm is proposed based on the Max-Min Frequency Window model. A max-min-frequency pattern tree (MMFP-Tree) structure is established to store the summary information of streams, and a new measure, the frequency changing rate (FCR), is presented to describe the frequency tendency of itemsets quantitatively. The MMFTP algorithm is useful for index tendency prediction and for confidence prediction in classification. The results of a case study on a web log data stream indicate that the MMFTP algorithm can predict the frequency tendency efficiently and effectively.
[1] MELEK W W, LU Z, KAPPS A, et al. Comparison of trend detection algorithms in the analysis of physiological time-series data [J]. IEEE Transactions on Biomedical Engineering, 2005, 52(4): 639-651.
[2] ELFEKY M G, AREF W G, ELMAGARMID A K. Periodicity detection in time series databases [J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(7): 875-887.
[3] ZHU Y, SHASHA D. StatStream: Statistical monitoring of thousands of data streams in real time [C]∥In Proceedings of International Conference on Very Large Data Bases. Hong Kong: VLDB Endowment, 2002: 358-369.
[4] KIFER D, BEN-DAVID S, GEHRKE J. Detecting change in data streams [C]∥In Proc. of the 30th Int. Conf. on Very Large Data Bases. Toronto: VLDB Endowment, 2004: 180-191.
[5] SONG Guojie, TANG Shiwei, YANG Dongqing, et al. Extraction and trend detection of unusual patterns over data streams [J]. Journal of Computer Research and Development, 2004, 41(10): 1754-1759. (in Chinese)
[6] ZHOU Qian, WU Tiejun. Real-time algorithm for trend analysis of dynamic data streams [J]. Control and Decision, 2008, 23(10): 1182-1185, 1191. (in Chinese)
[7] CALDERS T, DEXTERS N, GOETHALS B. Mining frequent items in a stream using flexible windows [J]. Intelligent Data Analysis, 2008, 12(3): 293-304.
[8] GUO Lichao, SU Hongye, QU Yu. A new algorithm for mining global frequent itemsets in a stream [C]∥In Proc. of the 6th Int. Conf. on Fuzzy Systems and Knowledge Discovery. Tianjin: IEEE Computer Society, 2009: 232-238.
[9] FRANK A, ASUNCION A. UCI Machine Learning Repository [EB/OL]. [2010]. http://archive.ics.uci.edu/ml.