Knowledge graph-enhanced three-stage related work generation

doi:10.6040/j.issn.1671-9352.1.2025.051

Abstract

Abstract: This study introduces a retrieval step into automated related work generation and proposes a knowledge graph-enhanced three-stage method (planning, retrieval, generation). The method targets the topic drift and omission of key references that arise because existing end-to-end generation techniques neglect the structured reasoning of academic writing. A knowledge graph-enhanced planning module lets the system capture multi-hop associated keywords, which makes the modeling of the research topic more comprehensive. Experimental results show a fourfold improvement in generation quality over direct generation and an 89% improvement over conventional retrieval-augmented generation (RAG). Literature coverage remains low overall, which indicates that planning-enhanced retrieval is an important research direction for automated related work generation.

Key words: large language models, knowledge graph, related work generation
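
The planning-retrieval-generation flow summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy graph, the toy corpus, the function names `expand_keywords` and `retrieve`, the breadth-first expansion, and the lexical scoring rule are all assumptions chosen for illustration, and the generation stage is reduced to assembling a prompt string.

```python
from collections import deque


def expand_keywords(graph, seeds, max_hops=2):
    """Planning (assumed form): collect keywords within max_hops of the seeds.

    Returns a dict mapping each reached keyword to its hop distance.
    """
    hops = {kw: 0 for kw in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(current, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    return hops


def retrieve(corpus, keywords, top_k=2):
    """Retrieval (toy lexical scoring): rank papers by how many keywords they mention."""
    scored = []
    for title, text in corpus.items():
        score = sum(1 for kw in keywords if kw in text.lower())
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


# Hypothetical keyword graph and corpus, for illustration only.
TOY_GRAPH = {
    "related work generation": ["citation text generation", "scientific summarization"],
    "citation text generation": ["citation graph"],
    "scientific summarization": ["abstractive summarization"],
    "citation graph": ["taxonomy tree"],
}
TOY_CORPUS = {
    "Paper A": "An abstractive summarization model for scientific summarization.",
    "Paper B": "Building a citation graph for citation text generation.",
    "Paper C": "Image classification with convolutional networks.",
}

plan = expand_keywords(TOY_GRAPH, ["related work generation"], max_hops=2)
papers = retrieve(TOY_CORPUS, plan.keys())
# Generation is stubbed: a real system would pass this prompt to a language model.
prompt = "Write a related work section covering: " + ", ".join(papers)
```

In this sketch the two-hop limit pulls in "citation graph" and "abstractive summarization", which the seed keyword alone would not match, while "taxonomy tree" (three hops away) is excluded. The hop limit is the knob that trades topic coverage against drift.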

CLC number: TP391

XIE Anzhe, AI Qingyao, LIU Yiqun, SU Weihang, MAO Jiaxin, ZHANG Min, MA Shaoping. Knowledge graph-enhanced three-stage related work generation[J]. Journal of Shandong University (Natural Science), 2026, 61(6): 1-12.

