Please wait a minute...
J4  2011, Vol. 45 Issue (1): 1-8    DOI: 10.3785/j.issn.1008-973X.2011.01.001
    
Efficient processing of complex XML twig pattern queries
based on path-joins
JIANG Jin-hua, WU Yu, HU Tian-lei, CHEN Gang
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Download:   PDF(0KB) HTML
Export: BibTeX | EndNote (RIS)      

Abstract  

A novel pathjoins based method was proposed to support efficient processing of complex twig pattern queries with OR-predicates of extensible markup language (XML) queries. The method processed the complex twig pattern matching in a holistic way based on the concept AND/OR branch extension (AOBE) and path-joins by dividing the twig pattern into individual paths. Then an index-based algorithm was proposed to efficiently skip useless elements and avoid unnecessary computations. The path-joins based method simplified the complex twig pattern queries processing compared with the existing algorithms.  The method only accessed the labels of leaf query nodes, thus the I/O and CPU costs were greatly reduced. Experimental results demonstrate that the method is more efficient than previous approaches.



Published: 03 March 2011
CLC:  TP 311.13  
Cite this article:

JIANG Jin-hua, WU Yu, HU Tian-lei, CHEN Gang. Efficient processing of complex XML twig pattern queries
based on path-joins. J4, 2011, 45(1): 1-8.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2011.01.001     OR     http://www.zjujournals.com/eng/Y2011/V45/I1/1


基于路径连接的XML复杂小枝模式查询处理

针对可扩展标记语言(XML)查询中具有嵌套OR谓词的复杂小枝模式查询处理,提出一种基于路径连接的查询方法.该方法以路径为分解粒度,结合分支扩展(AOBE)的概念,通过路径连接过程实现对复杂小枝模式查询的整体处理.为了进一步提高算法效率,在已有研究的基础上挖掘相应的优化规则,利用索引跳过那些明显不参与连接的元素的访问和计算.与已有算法相比,基于路径连接的查询方法大大简化了复杂小枝模式查询处理过程,只访问查询叶节点对应的元素,可以显著减少结构连接的操作数目和扫描元素的个数.实验结果表明,该方法能够有效地改善复杂小枝模式查询处理的性能.

[1] 孟小峰,周龙骧,王珊. 数据库技术发展趋势[J]. 软件学报, 2004,15(12):1822-1836.
MENG Xiaofeng, ZHOU Longxiang, WANG Shan. State of the art and trends in database research [J]. Journal of Software, 2004, 15(12): 1822-1836.
[2] 孔令波,唐世渭,杨冬青,等.XML 数据的查询技术[J]. 软件学报, 2007,18(6):1400-1418.
KONG Lingbo, TANG Shiwei, YANG Dongqing, et al. Querying techniques for XML data [J]. Journal of Software, 2007, 18(6): 1400-1418.
[3] BERGLUND A, BOAG S, CHAMBERLIN D, et al. XML path language (XPath) 2.0 [R]. Boston: W3C, 2002.
[4] BOAG S, CHAMBERLIN D, FERNANDEZ M F, et al. XQuery 1.0: an XML query language [R]. Boston: W3C, 2002.
[5] BRUNO N, KOUDAS N, SRIVASTAVA D. Holistic twig joins: optimal XML pattern matching [C]∥ Proceedings of ACM SIGMOD International Conference on Management of Data. Madison, Wisconsin: ACM, 2002: 310-321.
[6] JIANG H F, WANG W, LU H J. Holistic twig joins on indexed XML documents [C]∥ Proceeding of the 29th VLDB Conference. Berlin: Morgan Kaufmann, 2003: 273-284.
[7] LU J H, LING T W, CHAN C Y, et al. From region encoding to extended Dewey: on efficient processing of XML twig pattern matching [C]∥ Proceeding of the 31st VLDB Conference. Trondheim: Morgan Kaufmann, 2005: 193-204.
[8] CHEN S, LI H G, TATEMURA J, et al. Twig2Stack: bottomup processing of generalizedtreepattern queries over XML documents [C] ∥ Proceeding of the 32nd VLDB Conference. Seoul: Morgan Kaufmann, 2006: 283-296.
[9] POON C K, YUEN L. Faster twig pattern matching using extended Dewey ID [C]∥ Proceeding of the 17th International Conference on Database and Expert Systems Applications. Krakow: IEEE, 2006: 299-306.
[10] LI G L, FENG J H, ZHANG Y, et al. Efficient holistic twig joins in leaftoroot combining with roottoleaf way [C]∥ Proceeding of the 12nd International Conference on Database Systems for Advanced Applications. Bangkok: Springer, 2007: 834-849.
[11] ALKHALIFA S, JAGADISH H V, KOUDAS N, et al. Structural joins: a primitive for efficient XML query pattern matching [C]∥ Proceedings of the 18th International Conference on Data Engineering. San Jose: IEEE, 2002: 141-152.
[12] JIANG H F, LU H J, WANG W. Efficient processing of XML twig queries with ORpredicates [C]∥ Proceedings of ACM SIGMOD International Conference on Management of Data. Pairs: ACM, 2004: 59-70.
[13] 杨卫东,王清明,施伯乐. 针对XML流数据的复杂Twig Pattern查询处理[J]. 软件学报,2007,18(4): 893-904.
YANG Weidong, WANG Qingming, SHI Baile. Complex twig pattern query processing over XML streams [J]. Journal of Software, 2007, 18(4): 893-904.
[14] CHE D. Holistically processing XML twig queries with AND, OR, and NOT predicates [C]∥ Proceeding of the 2nd International Conference on Scalable Information Systems. Suzhou: ACM, 2007: 53-56.

[1] GUO Li-chao, SU Hong-ye, GOU Qian-wen. A new algorithm for frequency tendency prediction over data streams[J]. J4, 2012, 46(5): 858-865.
[2] WU Yu, SHOU Li-dan, CHEN Gang. CB-LSH: an efficient LSH indexing algorithm based on compressed bitmap[J]. J4, 2012, 46(3): 377-385.
[3] HONG Yin-jie, CHEN Gang, CHEN Ke. Set similarity join using partition index[J]. J4, 2012, 46(2): 286-293.
[4] ZHOU Jia-qing, WU Yu, JIANG Jin-hua, CHEN Gang, DONG Yi. Object cache optimization strategy for real-time vertical search engine[J]. J4, 2011, 45(1): 14-19.