JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE) ›› 2024, Vol. 59 ›› Issue (7): 76-84. doi: 10.6040/j.issn.1671-9352.1.2023.097

• Review •

Factual error detection in knowledge graphs based on dynamic neighbor selection

Liang GUI1,2, Yao XU1,2, Shizhu HE1,2,*, Yuanzhe ZHANG1,2,*, Kang LIU1,2, Jun ZHAO1,2

  1. The Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    2. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received:2023-10-18 Online:2024-07-20 Published:2024-07-15
  • Contact: Shizhu HE, Yuanzhe ZHANG  E-mail: guiliang21@mails.ucas.ac.cn; shizhu.he@nlpr.ia.ac.cn; yzzhang@nlpr.ia.ac.cn

Abstract:

The construction and updating of knowledge graphs (KGs) usually depend on large-scale web data and automated methods, which inevitably introduce factual errors into the modeled and acquired knowledge. To tackle this problem, a novel approach for identifying factual errors in knowledge graphs is proposed. The method actively selects the neighboring nodes of each fact to be checked and detects errors by measuring the association between the head and tail entities. Specifically, it first uses graph structure information to identify candidate neighbors for each entity. Then, based on contextual information, it dynamically selects the relevant neighbors and encodes node features with an efficient graph attention network. Finally, it judges whether the fact under consideration is erroneous by computing the consistency of the head and tail entity representations. Experimental results on multiple public KG datasets demonstrate that this method outperforms existing approaches in error detection.
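The pipeline described in the abstract (candidate neighbors from graph structure, context-driven dynamic selection, attention-based encoding, and a head-tail consistency score) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: all function names are hypothetical, the relation embedding is assumed to serve as the selection context, and the TransE-style consistency term is an illustrative choice.

```python
import numpy as np

def select_neighbors(context_vec, candidate_vecs, k=2):
    """Dynamic neighbor selection: of all structural neighbor
    candidates, keep only the k most similar to the context of the
    fact being checked (here, the relation embedding)."""
    sims = candidate_vecs @ context_vec
    return candidate_vecs[np.argsort(-sims)[:k]]

def attention_encode(entity_vec, neighbor_vecs, W, a):
    """Toy single-head graph-attention update: score each selected
    neighbor, softmax the scores, aggregate into the entity state."""
    h = W @ entity_vec
    msgs = neighbor_vecs @ W.T            # transformed neighbor messages
    logits = msgs @ a + h @ a             # additive attention logits
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return np.tanh(h + weights @ msgs)

def fact_score(head, rel, tail, head_nbrs, tail_nbrs, W, a):
    """Consistency of the neighbor-aware head and tail representations
    under the relation; lower scores mark likely factual errors."""
    h_enc = attention_encode(head, select_neighbors(rel, head_nbrs), W, a)
    t_enc = attention_encode(tail, select_neighbors(rel, tail_nbrs), W, a)
    return -np.linalg.norm(h_enc + rel - t_enc)
```

Triples would then be ranked by this score, with the lowest-scoring K percent flagged as suspected errors.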

Key words: knowledge graph, fact error detection, knowledge graph embedding, quality control, dynamic neighbor selection

CLC Number: TP391.1

Fig.1  Framework of the proposed DyNED

Table 1  Statistics of the datasets

Dataset      Triples   Entities   Relations
WN18RR        93 003     40 943          11
NELL-995     154 213     75 492         200
FB15k-237    310 116     14 541         237

Table 2  Fact error detection results of Precision@K and Recall@K on three datasets with an anomaly rate of 5%

Metric       Method            WN18RR                                   NELL-995                                 FB15k-237
                       K=1%   K=2%   K=3%   K=4%   K=5%    K=1%   K=2%   K=3%   K=4%   K=5%    K=1%   K=2%   K=3%   K=4%   K=5%
Precision@K  TransE    0.581  0.488  0.371  0.345  0.331   0.659  0.550  0.476  0.423  0.383   0.756  0.674  0.605  0.546  0.488
             ComplEx   0.518  0.444  0.382  0.341  0.307   0.627  0.538  0.472  0.427  0.378   0.718  0.651  0.590  0.534  0.485
             DistMult  0.574  0.451  0.390  0.349  0.322   0.630  0.553  0.493  0.446  0.408   0.709  0.646  0.582  0.529  0.483
             KGTtm     0.770  0.628  0.516  0.444  0.396   0.808  0.691  0.602  0.535  0.481   0.815  0.767  0.713  0.612  0.579
             CAGED     0.826  0.726  0.632  0.541  0.469   0.850  0.736  0.644  0.573  0.516   0.852  0.796  0.735  0.665  0.595
             DyNED     0.924  0.808  0.697  0.608  0.539   0.893  0.806  0.696  0.636  0.598   0.918  0.856  0.786  0.716  0.648
Recall@K     TransE    0.116  0.195  0.233  0.276  0.331   0.132  0.220  0.285  0.338  0.383   0.151  0.270  0.363  0.437  0.488
             ComplEx   0.103  0.177  0.229  0.273  0.307   0.125  0.215  0.283  0.341  0.378   0.143  0.260  0.354  0.427  0.485
             DistMult  0.114  0.180  0.234  0.279  0.322   0.126  0.221  0.295  0.357  0.408   0.141  0.258  0.349  0.423  0.483
             KGTtm     0.154  0.251  0.309  0.355  0.396   0.161  0.276  0.361  0.428  0.481   0.163  0.307  0.428  0.490  0.579
             CAGED     0.165  0.290  0.379  0.433  0.469   0.170  0.294  0.386  0.459  0.516   0.171  0.318  0.441  0.532  0.595
             DyNED     0.185  0.323  0.418  0.486  0.539   0.178  0.342  0.472  0.483  0.598   0.184  0.342  0.472  0.573  0.648
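The Precision@K and Recall@K figures in Table 2 rank all triples by anomaly score and inspect the top K percent of the ranking against the injected error labels. A small sketch of that computation (illustrative, not the paper's evaluation code; the function name is hypothetical):

```python
import numpy as np

def precision_recall_at_k(scores, labels, k_percent):
    """Rank triples by anomaly score (higher = more suspicious),
    take the top K percent, and compare against the error labels."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    k = max(1, int(len(scores) * k_percent / 100))
    top = np.argsort(-scores)[:k]    # indices of the k most suspicious triples
    hits = labels[top].sum()         # true errors among them
    return hits / k, hits / labels.sum()

# Tiny example: 5 triples, 2 of them true errors, inspect the top 40%.
p, r = precision_recall_at_k([0.9, 0.8, 0.1, 0.7, 0.2], [1, 1, 0, 0, 0], 40)
print(p, r)  # 1.0 1.0
```

Because the anomaly rate here equals 5%, Precision@K and Recall@K coincide at K=5%, which is visible down the last column of each dataset block.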

Table 3  Error detection results of each module on the WN18RR and NELL-995 datasets with an anomaly rate of 5%

Metric       Method              WN18RR                                   NELL-995
                           K=1%   K=2%   K=3%   K=4%   K=5%    K=1%   K=2%   K=3%   K=4%   K=5%
Precision@K  DyNED         0.924  0.808  0.697  0.608  0.539   0.893  0.806  0.696  0.636  0.598
             DyNED_Local   0.653  0.571  0.497  0.446  0.406   0.702  0.638  0.564  0.483  0.439
             DyNED_Global  0.738  0.623  0.538  0.477  0.435   0.762  0.679  0.612  0.532  0.491
Recall@K     DyNED         0.185  0.323  0.418  0.486  0.539   0.178  0.342  0.472  0.483  0.598
             DyNED_Local   0.135  0.228  0.291  0.357  0.406   0.145  0.257  0.323  0.402  0.439
             DyNED_Global  0.143  0.247  0.315  0.382  0.435   0.156  0.286  0.398  0.447  0.491