Computer Technology
Set similarity join using partition index
HONG Yin-jie, CHEN Gang, CHEN Ke
Department of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China