JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE), 2024, Vol. 59, Issue (7): 76-84. doi: 10.6040/j.issn.1671-9352.1.2023.097

Factual error detection in knowledge graphs based on dynamic neighbor selection

Liang GUI1,2, Yao XU1,2, Shizhu HE1,2,*, Yuanzhe ZHANG1,2,*, Kang LIU1,2, Jun ZHAO1,2

  1. The Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  2. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  Received: 2023-10-18  Online: 2024-07-20  Published: 2024-07-15
  Contact: Shizhu HE, Yuanzhe ZHANG


The construction and updating of knowledge graphs (KGs) usually depend on a wide range of web data and automated methods, which inevitably introduces factual errors into the modeled and acquired knowledge. To tackle this problem, a novel approach for detecting factual errors in knowledge graphs is proposed. The method dynamically selects neighboring nodes of the fact to be checked and detects errors by measuring the complex associations between the head and tail entities. Specifically, it first uses graph structure information to identify candidate neighbors for each entity. Then, based on contextual information, it dynamically selects the relevant neighbors and encodes node features with an efficient graph attention network. Finally, it determines whether the fact under consideration is erroneous by computing the consistency between the head and tail entity representations. Experimental results on multiple public KG datasets show that the method outperforms existing approaches in error detection.
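The three stages described in the abstract (candidate neighbors, context-based selection with attention encoding, and a head-tail consistency check) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the authors' formulation: relevance is taken to be a dot product with a relation vector used as context, the attention network is reduced to a single softmax-weighted average, and consistency is taken to be cosine similarity. The function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def select_neighbors(candidates, context, k):
    """Keep the k candidate neighbors most relevant to the context.
    Assumption: relevance is a plain dot product with the context vector."""
    ranked = sorted(candidates, key=lambda n: dot(n, context), reverse=True)
    return ranked[:k]

def attend(entity, neighbors, context):
    """Softmax-weighted average of the entity and its selected neighbors.
    A stand-in for a graph attention layer, not the paper's network."""
    vectors = [entity] + neighbors
    weights = softmax([dot(v, context) for v in vectors])
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(entity))]

def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def consistency_score(head, head_cands, tail, tail_cands, relation, k=2):
    """Higher values mean the head and tail representations agree more;
    a low value would flag the triple as a possible error."""
    h = attend(head, select_neighbors(head_cands, relation, k), relation)
    t = attend(tail, select_neighbors(tail_cands, relation, k), relation)
    return cosine(h, t)

# Toy 2-dimensional embeddings (made up for this sketch).
relation = [1.0, 0.0]
head, head_cands = [1.0, 0.0], [[0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
good_tail, good_cands = [1.0, 0.1], [[0.9, 0.0], [0.1, 0.9]]
bad_tail, bad_cands = [-1.0, 0.2], [[-0.9, 0.1], [-0.8, 0.0]]

score_good = consistency_score(head, head_cands, good_tail, good_cands, relation)
score_bad = consistency_score(head, head_cands, bad_tail, bad_cands, relation)
```

In this toy setting the triple with the mismatched tail receives the lower score, so ranking triples by ascending consistency would surface it first.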

Key words: knowledge graph, fact error detection, knowledge graph embedding, quality control, dynamic neighbor selection

Table 1

Statistics of the datasets

Dataset    Triples  Entities  Relations
WN18RR     93,003   40,943    11
NELL-995   154,213  75,492    200
FB15k-237  310,116  14,541    237

Table 2

Fact error detection results (Precision@K and Recall@K) on three datasets with an anomaly rate of 5%

Metric       Method    WN18RR                         NELL-995                       FB15k-237
                       K=1%  K=2%  K=3%  K=4%  K=5%   K=1%  K=2%  K=3%  K=4%  K=5%   K=1%  K=2%  K=3%  K=4%  K=5%
Precision@K  TransE    0.581 0.488 0.371 0.345 0.331  0.659 0.550 0.476 0.423 0.383  0.756 0.674 0.605 0.546 0.488
             ComplEx   0.518 0.444 0.382 0.341 0.307  0.627 0.538 0.472 0.427 0.378  0.718 0.651 0.590 0.534 0.485
             DistMult  0.574 0.451 0.390 0.349 0.322  0.630 0.553 0.493 0.446 0.408  0.709 0.646 0.582 0.529 0.483
             KGTtm     0.770 0.628 0.516 0.444 0.396  0.808 0.691 0.602 0.535 0.481  0.815 0.767 0.713 0.612 0.579
             CAGED     0.826 0.726 0.632 0.541 0.469  0.850 0.736 0.644 0.573 0.516  0.852 0.796 0.735 0.665 0.595
             DyNED     0.924 0.808 0.697 0.608 0.539  0.893 0.806 0.696 0.636 0.598  0.918 0.856 0.786 0.716 0.648
Recall@K     TransE    0.116 0.195 0.233 0.276 0.331  0.132 0.220 0.285 0.338 0.383  0.151 0.270 0.363 0.437 0.488
             ComplEx   0.103 0.177 0.229 0.273 0.307  0.125 0.215 0.283 0.341 0.378  0.143 0.260 0.354 0.427 0.485
             DistMult  0.114 0.180 0.234 0.279 0.322  0.126 0.221 0.295 0.357 0.408  0.141 0.258 0.349 0.423 0.483
             KGTtm     0.154 0.251 0.309 0.355 0.396  0.161 0.276 0.361 0.428 0.481  0.163 0.307 0.428 0.490 0.579
             CAGED     0.165 0.290 0.379 0.433 0.469  0.170 0.294 0.386 0.459 0.516  0.171 0.318 0.441 0.532 0.595
             DyNED     0.185 0.323 0.418 0.486 0.539  0.178 0.342 0.472 0.483 0.598  0.184 0.342 0.472 0.573 0.648
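The two metrics in Table 2 can be computed as in the following minimal sketch, which assumes the usual ranking definitions: triples are sorted from most to least suspicious, the top K% of all triples are inspected, Precision@K is the share of true errors among them, and Recall@K is the share of all errors that they cover. The exact definitions used in the paper are not given in this excerpt, and the function name and toy data are hypothetical. Under these definitions Recall@K equals Precision@K multiplied by K and divided by the anomaly rate, which matches the tabulated values (for TransE on WN18RR, 0.581 × 1% / 5% ≈ 0.116) and explains why the two metrics coincide at K=5%.

```python
def precision_recall_at_k(scores, is_error, k_fraction):
    """Precision@K and Recall@K for a ranked error-detection list.

    scores: suspicion score per triple (higher means more likely an error).
    is_error: ground-truth flags, True for injected errors.
    k_fraction: share of all triples to inspect, e.g. 0.05 for K=5%.
    """
    n_top = max(1, int(round(len(scores) * k_fraction)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(1 for i in order[:n_top] if is_error[i])
    total_errors = sum(1 for flag in is_error if flag)
    precision = hits / n_top
    recall = hits / total_errors if total_errors else 0.0
    return precision, recall

# Toy data: 100 triples, 5 of them errors (a 5% anomaly rate).
scores = [i / 100 for i in range(100)]
is_error = [i in (99, 98, 97, 50, 10) for i in range(100)]

p5, r5 = precision_recall_at_k(scores, is_error, 0.05)  # inspects 5 triples
p1, r1 = precision_recall_at_k(scores, is_error, 0.01)  # inspects 1 triple
```

In the toy data three of the five errors rank in the top 5%, so precision and recall are both 0.6 at K=5%; at K=1% the single inspected triple is an error, giving a precision of 1.0 and a recall of 0.2.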

Table 3

Error detection results of each module on the WN18RR and NELL-995 datasets with an anomaly rate of 5%

Metric       Method        WN18RR                         NELL-995
                           K=1%  K=2%  K=3%  K=4%  K=5%   K=1%  K=2%  K=3%  K=4%  K=5%
Precision@K  DyNED         0.924 0.808 0.697 0.608 0.539  0.893 0.806 0.696 0.636 0.598
             DyNED_Local   0.653 0.571 0.497 0.446 0.406  0.702 0.638 0.564 0.483 0.439
             DyNED_Global  0.738 0.623 0.538 0.477 0.435  0.762 0.679 0.612 0.532 0.491
Recall@K     DyNED         0.185 0.323 0.418 0.486 0.539  0.178 0.342 0.472 0.483 0.598
             DyNED_Local   0.135 0.228 0.291 0.357 0.406  0.145 0.257 0.323 0.402 0.439
             DyNED_Global  0.143 0.247 0.315 0.382 0.435  0.156 0.286 0.398 0.447 0.491
