Entity alignment method based on embedding features and sparse matrices
Chaowen FENG1,2, Chengchen GENG1,2, Yingli LIU1,2,*
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
2. Yunnan Key Laboratory of Computer Technology Applications, Kunming University of Science and Technology, Kunming 650500, China

Abstract: Entity alignment for multilingual knowledge fusion suffers from coarse-grained feature modeling and limited use of structural information. To address this, an entity alignment method is proposed that integrates multi-level embedding features with a sparse matrix propagation mechanism. Each entity is represented by a unified embedding that fuses character-level features, word-level embeddings, and neighborhood relational information, giving a fine-grained expression of both semantics and structure. For efficient knowledge propagation, a sparse adjacency matrix is constructed from relation embeddings, and a normalization-based mechanism stabilizes feature transmission across the graphs. To improve global consistency during alignment, Sinkhorn regularization refines the similarity matrix, and the Hungarian algorithm then produces an optimal one-to-one matching. On multiple cross-lingual knowledge graph datasets, the method achieves stable performance on evaluation metrics such as hit rate and mean reciprocal rank. Compared with representative methods such as SNGA and EAMI, it is strongly competitive, which supports its accuracy and robustness.
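
The sparse propagation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting or normalization formulas, so the sketch assumes unit edge weights (standing in for weights derived from relation embeddings), row normalization of the adjacency matrix, and L2 normalization of the features after every hop. The names `build_adjacency`, `l2_normalize`, and `propagate` are hypothetical.

```python
import numpy as np
import scipy.sparse as sp


def build_adjacency(triples, n_entities):
    """Row-normalized sparse adjacency built from (head, relation, tail) triples.

    Assumption: every edge has weight 1; the paper derives the matrix from
    relation embeddings, which the abstract does not specify.
    """
    heads = np.array([h for h, _, t in triples])
    tails = np.array([t for h, _, t in triples])
    loops = np.arange(n_entities)  # self-loops keep each entity's own features
    rows = np.concatenate([heads, tails, loops])
    cols = np.concatenate([tails, heads, loops])
    data = np.ones(len(rows))
    # Converting COO to CSR sums duplicate (row, col) entries.
    adj = sp.coo_matrix((data, (rows, cols)),
                        shape=(n_entities, n_entities)).tocsr()
    degree = np.asarray(adj.sum(axis=1)).ravel()  # >= 1 because of self-loops
    return sp.diags(1.0 / degree) @ adj


def l2_normalize(x, eps=1e-12):
    """Scale every row to unit length so feature magnitudes stay bounded."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def propagate(features, adj, n_hops=2):
    """Concatenate the normalized features from hop 0 up to hop n_hops."""
    outputs = [l2_normalize(features)]
    for _ in range(n_hops):
        outputs.append(l2_normalize(adj @ outputs[-1]))
    return np.concatenate(outputs, axis=1)


# Toy graph: three entities connected in a chain 0 - 1 - 2.
toy_triples = [(0, 0, 1), (1, 1, 2)]
toy_adj = build_adjacency(toy_triples, n_entities=3)
toy_out = propagate(np.eye(3), toy_adj, n_hops=2)
print(toy_out.shape)
```

Because the adjacency matrix stays sparse, each hop costs time proportional to the number of edges rather than to the square of the number of entities, and the per-hop normalization keeps repeated multiplication from shrinking or inflating the features.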
Received: 06 March 2025
Published: 03 February 2026
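Fund: National Natural Science Foundation of China (52061020); Yunnan Province Major Science and Technology Special Program (202302AG050009); Open Fund of the Yunnan Key Laboratory of Computer Technology Applications (2024G05).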
Corresponding Authors:
Yingli LIU
E-mail: 15236085295@163.com;lyl@kust.edu.cn
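Entity alignment method based on embedding features and sparse matrices
Entity alignment for multilingual knowledge fusion faces two challenges: insufficient granularity in feature modeling and limited use of structural information. To address them, an entity alignment method is proposed that fuses multi-level embedding features with a sparse matrix propagation mechanism. Character features, word-vector features, and neighborhood relation features are combined into a unified multi-dimensional entity representation, which strengthens local semantic expression and the modeling of structural associations. A sparse adjacency matrix is built from relation embeddings and combined with a feature-normalized propagation mechanism, so that information spreads stably and is transmitted effectively across the knowledge graph. To further improve the global consistency of entity matching, Sinkhorn regularization is introduced to optimize the similarity matrix, and the Hungarian algorithm is used to perform the optimal entity alignment. On multiple cross-lingual knowledge graph datasets, the proposed method performs stably on the hit rate and mean reciprocal rank metrics and is highly competitive with representative methods such as SNGA and EAMI. These results validate the accuracy and robustness of the proposed method.
Keywords: knowledge graph, entity alignment, multi-level feature modeling, sparse matrix propagation, Sinkhorn regularization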
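
The final matching stage, Sinkhorn regularization followed by the Hungarian algorithm, can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's code: the `temperature` and `n_iters` values are arbitrary choices, the function names `sinkhorn` and `align` are hypothetical, and the toy similarity matrix is invented for the example. The assignment itself uses SciPy's `linear_sum_assignment`, which solves the same one-to-one assignment problem as the Hungarian algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.special import logsumexp


def sinkhorn(similarity, temperature=0.05, n_iters=50):
    """Alternately normalize rows and columns of exp(similarity / temperature).

    Works in log space for numerical stability. The result is approximately
    doubly stochastic; columns sum to 1 exactly after the last step.
    """
    log_p = np.asarray(similarity, dtype=float) / temperature
    for _ in range(n_iters):
        log_p = log_p - logsumexp(log_p, axis=1, keepdims=True)  # rows
        log_p = log_p - logsumexp(log_p, axis=0, keepdims=True)  # columns
    return np.exp(log_p)


def align(similarity):
    """Return a one-to-one mapping {source index: target index}."""
    refined = sinkhorn(similarity)
    # maximize=True selects the assignment with the largest total score.
    rows, cols = linear_sum_assignment(refined, maximize=True)
    return dict(zip(rows.tolist(), cols.tolist()))


# Invented toy scores: sources 0 and 1 both prefer target 0.
sim = np.array([
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
])
greedy = np.argmax(sim, axis=1).tolist()  # per-row best match, may collide
matching = align(sim)                     # globally consistent matching
print(greedy, matching)
```

In this toy case the greedy row-wise choice assigns sources 0 and 1 to the same target, while the refined one-to-one matching maps source 0 to target 1 and source 1 to target 0, which is the kind of global consistency the abstract refers to.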