CQPM-based OLAP query log mining and recommendation
YIN Ting¹, XIAO Min¹, CHEN Ling¹, ZHAO Jiang-qi², WANG Jing-chang²
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China;
2. Zhejiang Hongcheng Computer Systems Company Limited, Hangzhou 310009, China
Abstract To improve the efficiency with which users work with online analytical processing (OLAP) systems, a query log mining and recommendation method based on continuous query pattern mining (CQPM) is proposed. CQPM builds on bi-directional extension (BIDE), an algorithm for mining frequent closed sequential patterns, and adds an interval constraint between queries to ensure that the mined query patterns are continuous. An approximate query pattern matching (AQPM) algorithm based on a query suffix tree predicts the user's next effective query; the predictions are ranked by probability and recommended to the user. The method was evaluated on the query logs of 8 OLAP analysts, recorded by a Mondrian OLAP server. The results show that, compared with an improved PrefixSpan-based algorithm, CQPM removes a large number of redundant query patterns, and that, compared with basic prefix matching, AQPM improves recommendation accuracy.
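As a rough illustration of the two ideas in the abstract, the sketch below mines contiguous query patterns from session logs, keeps only the closed ones, and then ranks candidate next queries for a user's recent history. It is a simplified stand-in, not the paper's CQPM or AQPM: it enumerates contiguous subsequences by brute force instead of running BIDE, scans the pattern set linearly instead of using a query suffix tree, and treats "approximate" matching as backing off to shorter suffixes of the history, which is an assumption. The query identifiers (`q1`, `q2`, ...), the function names, and the `max_len` cap are all hypothetical.

```python
from collections import Counter

def mine_contiguous_patterns(sessions, min_support, max_len=4):
    """Count contiguous query subsequences (no gap between queries).

    Support is the number of sessions containing the pattern at least once.
    """
    support = Counter()
    for session in sessions:
        seen = set()
        for i in range(len(session)):
            for j in range(i + 1, min(i + max_len, len(session)) + 1):
                seen.add(tuple(session[i:j]))
        support.update(seen)  # each session counts once per pattern
    return {p: s for p, s in support.items() if s >= min_support}

def is_contiguous_sub(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def closed_only(patterns):
    """Drop patterns that a longer pattern with equal support already covers."""
    return {
        p: s for p, s in patterns.items()
        if not any(len(q) > len(p) and t == s and is_contiguous_sub(p, q)
                   for q, t in patterns.items())
    }

def recommend(patterns, recent, top_k=3):
    """Rank next-query candidates for the longest matching suffix of `recent`.

    Backing off to shorter suffixes is an assumed form of approximate matching.
    Scores are supports normalised over the candidates, a rough probability.
    """
    for start in range(len(recent)):  # longest suffix first
        suffix = tuple(recent[start:])
        n = len(suffix)
        votes = {}
        for p, s in patterns.items():
            for i in range(len(p) - n):  # suffix must be followed by a query
                if p[i:i + n] == suffix:
                    nxt = p[i + n]
                    votes[nxt] = max(votes.get(nxt, 0), s)
        if votes:
            total = sum(votes.values())
            ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
            return [(q, c / total) for q, c in ranked[:top_k]]
    return []

# Hypothetical session logs: each list is one analyst session of query ids.
sessions = [
    ["q1", "q2", "q3", "q4"],
    ["q1", "q2", "q3"],
    ["q2", "q3", "q5"],
    ["q1", "q2", "q5"],
    ["q2", "q5"],
]
frequent = mine_contiguous_patterns(sessions, min_support=2)
closed = closed_only(frequent)
print(len(frequent), "frequent patterns,", len(closed), "closed patterns")
print(recommend(closed, ["q1", "q2"]))
print(recommend(closed, ["q9", "q2"]))  # unseen q9: backs off to suffix (q2,)
```

Taking the maximum support over the closed patterns that contain a candidate extension recovers that extension's own support within the mined set, which is why discarding non-closed patterns loses no information for ranking in this sketch.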
Published: 11 December 2012