J4  2012, Vol. 46 Issue (5): 858-865    DOI: 10.3785/j.issn.1008-973X.2012.05.014
Automation Technology, Electrical Engineering
A new algorithm for frequency tendency prediction over data streams
GUO Li-chao (郭立超)1, SU Hong-ye (苏宏业)1, GOU Qian-wen (缑倩雯)2
1. Institute of Cyber-System and Control, Zhejiang University, Hangzhou 310027, China;
2. College of Public Administration, Zhejiang University, Hangzhou 310027, China
Abstract:

To predict how the frequency of itemsets changes over a data stream, a max-min-frequency tendency prediction (MM-FTP) algorithm is proposed, based on the Max-Min Frequency Window model. A new max-min-frequency pattern tree (MMFP-Tree) structure is designed to store summary information about the stream, and a new measure, the frequency changing rate (FCR), is introduced to describe the frequency tendency of itemsets quantitatively. The MM-FTP algorithm can also predict the tendency of classification confidence over data streams, as well as conventional index tendencies. A case study on a real web click data stream shows that the MM-FTP algorithm predicts the frequency tendency of itemsets quickly and accurately.

Published: 2012-05-01
CLC number: TP 311.13
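Funding:

Supported by the National High Technology Research and Development Program of China (863 Program) (2008AA042902), the National Basic Research Program of China (973 Program) (2007CB714000), and the Zhejiang Province Technology Innovation Fund for Science and Technology-based Small and Medium-sized Enterprises (2009D40034).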

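Corresponding author: SU Hong-ye, male, professor, doctoral supervisor. E-mail: hysu@iipc.zju.edu.cn
About the author: GUO Li-chao (1983-), male, Ph.D., engaged in research on data mining and related areas. E-mail: lichao.g@gmail.com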

Cite this article:

GUO Li-chao, SU Hong-ye, GOU Qian-wen. A new algorithm for frequency tendency prediction over data streams [J]. J4, 2012, 46(5): 858-865.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2012.05.014
http://www.zjujournals.com/eng/CN/Y2012/V46/I5/858

