Please wait a minute...
J4  2011, Vol. 45 Issue (1): 1-8    DOI: 10.3785/j.issn.1008-973X.2011.01.001
计算机技术﹑电信技术     
基于路径连接的XML复杂小枝模式查询处理
江锦华,吴羽,胡天磊,陈刚
浙江大学 计算机科学与技术学院,浙江 杭州 310027
Efficient processing of complex XML twig pattern queries
based on path-joins
JIANG Jin-hua, WU Yu, HU Tian-lei, CHEN Gang
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
 全文: PDF  HTML
摘要:

针对可扩展标记语言(XML)查询中具有嵌套OR谓词的复杂小枝模式查询处理,提出一种基于路径连接的查询方法.该方法以路径为分解粒度,结合分支扩展(AOBE)的概念,通过路径连接过程实现对复杂小枝模式查询的整体处理.为了进一步提高算法效率,在已有研究的基础上挖掘相应的优化规则,利用索引跳过那些明显不参与连接的元素的访问和计算.与已有算法相比,基于路径连接的查询方法大大简化了复杂小枝模式查询处理过程,只访问查询叶节点对应的元素,可以显著减少结构连接的操作数目和扫描元素的个数.实验结果表明,该方法能够有效地改善复杂小枝模式查询处理的性能.

Abstract:

A novel pathjoins based method was proposed to support efficient processing of complex twig pattern queries with OR-predicates of extensible markup language (XML) queries. The method processed the complex twig pattern matching in a holistic way based on the concept AND/OR branch extension (AOBE) and path-joins by dividing the twig pattern into individual paths. Then an index-based algorithm was proposed to efficiently skip useless elements and avoid unnecessary computations. The path-joins based method simplified the complex twig pattern queries processing compared with the existing algorithms.  The method only accessed the labels of leaf query nodes, thus the I/O and CPU costs were greatly reduced. Experimental results demonstrate that the method is more efficient than previous approaches.

出版日期: 2011-03-03
:  TP 311.13  
基金资助:

 国家自然科学基金资助项目(60603044, 60803003);国家“863”高技术研究发展计划资助项目(2006AA010107);浙江省重大科技专项国际科技合作项目(2008C14060).

通讯作者: 陈刚,男,教授.     E-mail: cg@zju.edu.cn
作者简介: 江锦华(1983-),男,浙江衢州人,博士生,从事XML数据库、垂直搜索引擎等研究.E-mail:jiangjinhua.zju@163.com
服务  
把本文推荐给朋友
加入引用管理器
E-mail Alert
RSS
作者相关文章  

引用本文:

江锦华,吴羽,胡天磊,陈刚. 基于路径连接的XML复杂小枝模式查询处理[J]. J4, 2011, 45(1): 1-8.

JIANG Jin-hua, WU Yu, HU Tian-lei, CHEN Gang. Efficient processing of complex XML twig pattern queries
based on path-joins. J4, 2011, 45(1): 1-8.

链接本文:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2011.01.001        http://www.zjujournals.com/eng/CN/Y2011/V45/I1/1

[1] 孟小峰,周龙骧,王珊. 数据库技术发展趋势[J]. 软件学报, 2004,15(12):1822-1836.
MENG Xiaofeng, ZHOU Longxiang, WANG Shan. State of the art and trends in database research [J]. Journal of Software, 2004, 15(12): 1822-1836.
[2] 孔令波,唐世渭,杨冬青,等.XML 数据的查询技术[J]. 软件学报, 2007,18(6):1400-1418.
KONG Lingbo, TANG Shiwei, YANG Dongqing, et al. Querying techniques for XML data [J]. Journal of Software, 2007, 18(6): 1400-1418.
[3] BERGLUND A, BOAG S, CHAMBERLIN D, et al. XML path language (XPath) 2.0 [R]. Boston: W3C, 2002.
[4] BOAG S, CHAMBERLIN D, FERNANDEZ M F, et al. XQuery 1.0: an XML query language [R]. Boston: W3C, 2002.
[5] BRUNO N, KOUDAS N, SRIVASTAVA D. Holistic twig joins: optimal XML pattern matching [C]∥ Proceedings of ACM SIGMOD International Conference on Management of Data. Madison, Wisconsin: ACM, 2002: 310-321.
[6] JIANG H F, WANG W, LU H J. Holistic twig joins on indexed XML documents [C]∥ Proceeding of the 29th VLDB Conference. Berlin: Morgan Kaufmann, 2003: 273-284.
[7] LU J H, LING T W, CHAN C Y, et al. From region encoding to extended Dewey: on efficient processing of XML twig pattern matching [C]∥ Proceeding of the 31st VLDB Conference. Trondheim: Morgan Kaufmann, 2005: 193-204.
[8] CHEN S, LI H G, TATEMURA J, et al. Twig2Stack: bottomup processing of generalizedtreepattern queries over XML documents [C] ∥ Proceeding of the 32nd VLDB Conference. Seoul: Morgan Kaufmann, 2006: 283-296.
[9] POON C K, YUEN L. Faster twig pattern matching using extended Dewey ID [C]∥ Proceeding of the 17th International Conference on Database and Expert Systems Applications. Krakow: IEEE, 2006: 299-306.
[10] LI G L, FENG J H, ZHANG Y, et al. Efficient holistic twig joins in leaftoroot combining with roottoleaf way [C]∥ Proceeding of the 12nd International Conference on Database Systems for Advanced Applications. Bangkok: Springer, 2007: 834-849.
[11] ALKHALIFA S, JAGADISH H V, KOUDAS N, et al. Structural joins: a primitive for efficient XML query pattern matching [C]∥ Proceedings of the 18th International Conference on Data Engineering. San Jose: IEEE, 2002: 141-152.
[12] JIANG H F, LU H J, WANG W. Efficient processing of XML twig queries with ORpredicates [C]∥ Proceedings of ACM SIGMOD International Conference on Management of Data. Pairs: ACM, 2004: 59-70.
[13] 杨卫东,王清明,施伯乐. 针对XML流数据的复杂Twig Pattern查询处理[J]. 软件学报,2007,18(4): 893-904.
YANG Weidong, WANG Qingming, SHI Baile. Complex twig pattern query processing over XML streams [J]. Journal of Software, 2007, 18(4): 893-904.
[14] CHE D. Holistically processing XML twig queries with AND, OR, and NOT predicates [C]∥ Proceeding of the 2nd International Conference on Scalable Information Systems. Suzhou: ACM, 2007: 53-56.

[1] 郭立超, 苏宏业, 缑倩雯. 一种新的数据流频繁度变化趋势预测算法[J]. J4, 2012, 46(5): 858-865.
[2] 吴羽,寿黎但,陈刚. CB-LSH:基于压缩位图的高性能LSH索引算法[J]. J4, 2012, 46(3): 377-385.
[3] 洪银杰, 陈刚, 陈珂. 基于分区索引的集合相似连接[J]. J4, 2012, 46(2): 286-293.
[4] 周佳庆, 吴羽, 江锦华, 陈刚,董轶. 实时垂直搜索引擎对象缓存优化策略[J]. J4, 2011, 45(1): 14-19.