Journal of Shandong University (Natural Science), 2021, Vol. 56, Issue (5): 76-84. doi: 10.6040/j.issn.1671-9352.1.2020.019
王雪彦1,2,3,何婷婷1,2,3*,黄翔4,王俊美5,潘敏6
WANG Xue-yan1,2,3, HE Ting-ting1,2,3*, HUANG Xiang4, WANG Jun-mei5, PAN Min6
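Abstract: This paper proposes LRoc (location-based Rocchio framework), a pseudo-relevance feedback framework based on positional relations within documents. The framework uses different kernel functions to model the positions of candidate terms in the feedback documents, derives a positional importance score for each candidate expansion term, and incorporates that score into the classic Rocchio model. When selecting and evaluating candidate expansion terms, the method therefore considers not only term frequency but also the influence of term position, which helps identify expansion terms that are more likely to be relevant to the query. Experiments on five TREC datasets show that the three models built on the LRoc framework (LRoc1, LRoc2, and LRoc3) achieve significant improvements over the baseline models in MAP and P@20.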
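The idea of weighting candidate expansion terms by position before a Rocchio update can be illustrated with a minimal sketch. The abstract does not give the kernels or formulas used by LRoc1, LRoc2, or LRoc3, so everything below is an assumption made for illustration: a Gaussian kernel over normalized token position centered on the start of the document, the hypothetical helper names `gaussian_kernel`, `location_weighted_doc_vector`, and `rocchio_expand`, and the values of `sigma`, `alpha`, `beta`, and `top_k`. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def gaussian_kernel(pos, center=0.0, sigma=0.5):
    """Weight in (0, 1] for a normalized position pos in [0, 1].

    Assumption: terms near `center` (here, the start of the document)
    matter most. The paper's actual kernels are not given in the abstract.
    """
    return math.exp(-((pos - center) ** 2) / (2 * sigma ** 2))


def location_weighted_doc_vector(tokens, kernel=gaussian_kernel):
    """Sum the kernel weight over every occurrence of each term.

    With a constant kernel of 1.0 this reduces to raw term frequency,
    so position acts as a reweighting of frequency.
    """
    n = len(tokens)
    vec = defaultdict(float)
    for i, tok in enumerate(tokens):
        pos = i / (n - 1) if n > 1 else 0.0  # normalize position to [0, 1]
        vec[tok] += kernel(pos)
    return dict(vec)


def rocchio_expand(query_vec, feedback_docs, alpha=1.0, beta=0.75, top_k=5):
    """Rocchio-style update: alpha * query + beta * feedback centroid.

    The centroid is built from location-weighted document vectors, and
    only the top_k highest-scoring candidate terms are kept.
    """
    centroid = defaultdict(float)
    for tokens in feedback_docs:
        for term, w in location_weighted_doc_vector(tokens).items():
            centroid[term] += w / len(feedback_docs)

    # Rank candidates by weight (ties broken alphabetically).
    candidates = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

    expanded = {t: alpha * w for t, w in query_vec.items()}
    for term, w in candidates:
        expanded[term] = expanded.get(term, 0.0) + beta * w
    return expanded


if __name__ == "__main__":
    docs = [["kernel", "a", "b", "late"]]
    print(rocchio_expand({"kernel": 1.0}, docs, top_k=2))
```

In this sketch a term that appears early in a feedback document contributes more to the expanded query than the same term appearing late, which is one way, under the stated assumptions, that position can complement term frequency.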