J4  2012, Vol. 46 Issue (11): 2052-2060    DOI: 10.3785/j.issn.1008-973X.2012.11.017
    
CQPM based OLAP query log mining and recommendation
YIN Ting1, XIAO Min1, CHEN Ling1, ZHAO Jiang-qi2, WANG Jing-chang2
1.College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China;
2.Zhejiang Hongcheng Computer Systems Company Limited, Hangzhou 310009, China

Abstract  

To improve the efficiency with which users work with online analytical processing (OLAP) systems, an OLAP query log mining and recommendation method based on continuous query pattern mining (CQPM) was proposed. CQPM builds on the closed sequential pattern mining algorithm BI-Directional Extension (BIDE) and adds an interval constraint between queries to ensure that the mined query patterns are continuous. An approximate query pattern matching (AQPM) algorithm based on a query suffix tree was also developed to predict the next effective query of a user; the predictions, ranked by probability, are then recommended to the user. The method was evaluated on the query logs of 8 OLAP analysts, recorded by a Mondrian OLAP server. The results show that, compared with a PrefixSpan-based algorithm, CQPM eliminates a large number of redundant query patterns, and that, compared with the basic prefix matching method, AQPM increases the accuracy of recommendation.
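The two ideas in the abstract can be illustrated with a small Python sketch. It is not the authors' CQPM or AQPM: it assumes the strictest interval constraint (queries in a pattern must be adjacent in the session), finds closed patterns by brute-force enumeration rather than BIDE, and replaces the query suffix tree with a simple back-off to shorter suffixes of the recent queries. The names `mine_contiguous_patterns`, `closed_patterns`, and `recommend_next`, and the query labels `q1`…`q5`, are hypothetical.

```python
from collections import Counter


def mine_contiguous_patterns(sessions, min_support):
    """Count every contiguous query subsequence (assumed gap of 0) by the
    number of sessions that contain it, and keep the frequent ones."""
    support = Counter()
    for session in sessions:
        seen = set()  # count each pattern at most once per session
        for i in range(len(session)):
            for j in range(i + 1, len(session) + 1):
                seen.add(tuple(session[i:j]))
        support.update(seen)
    return {p: s for p, s in support.items() if s >= min_support}


def is_contiguous_sub(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def closed_patterns(frequent):
    """Drop a pattern when a longer frequent pattern contains it with the
    same support, since the shorter one is then redundant."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(
            len(q) > len(p) and t == s and is_contiguous_sub(p, q)
            for q, t in frequent.items()
        )
    }


def recommend_next(frequent, recent, top_k=3):
    """Match the longest suffix of the recent queries that some frequent
    pattern extends by one query, and rank the candidate next queries by
    relative support. Backing off to shorter suffixes is a stand-in for
    approximate matching, not the paper's suffix-tree algorithm."""
    for start in range(len(recent)):
        suffix = tuple(recent[start:])
        votes = {
            p[-1]: s
            for p, s in frequent.items()
            if len(p) == len(suffix) + 1 and p[:-1] == suffix
        }
        if votes:
            total = sum(votes.values())
            ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
            return [(q, s / total) for q, s in ranked[:top_k]]
    return []


# Hypothetical query sessions; each label stands for one OLAP query.
sessions = [
    ["q1", "q2", "q3"],
    ["q1", "q2", "q3"],
    ["q1", "q2", "q4"],
    ["q5", "q2", "q3"],
]
frequent = mine_contiguous_patterns(sessions, min_support=2)
closed = closed_patterns(frequent)
suggestions = recommend_next(frequent, ["q9", "q2"])
```

In this toy log, the single-query pattern `("q1",)` is dropped from the closed set because `("q1", "q2")` occurs in exactly the same sessions, which is the kind of redundancy that closed pattern mining removes. The call with `["q9", "q2"]` finds no pattern for the full two-query context and falls back to the suffix `("q2",)`.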



Published: 11 December 2012
CLC:  TP 311  
Cite this article:

YIN Ting, XIAO Min, CHEN Ling, ZHAO Jiang-qi, WANG Jing-chang. CQPM based OLAP query log mining and recommendation. J4, 2012, 46(11): 2052-2060.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2012.11.017     OR     http://www.zjujournals.com/eng/Y2012/V46/I11/2052



