
Journal of Shandong University (Natural Science), 2017, Vol. 52, Issue (1): 29-36. doi: 10.6040/j.issn.1671-9352.2.2015.E49


The method of query expansion based on local co-occurrence and context similarity

TANG Liang, ZHAO Xiao-feng, XI Yao-yi, YI Mian-zhu*

  1. Foreign Languages College of Chinese People's Liberation Army, Luoyang 471003, Henan, China
  • Received: 2015-09-01 Online: 2017-01-20 Published: 2017-01-16
  • Corresponding author: YI Mian-zhu (born 1963), male, Ph.D., professor; research interest: computational linguistics. E-mail: 13373781261@163.com E-mail: tl-wy@163.com
  • About the author: TANG Liang (born 1976), male, Ph.D., lecturer; research interests: information retrieval and knowledge graphs. E-mail: tl-wy@163.com
  • Supported by:
    National Basic Research Program of China (973 Program) (2014CB340400, 2012CB316303); Key Program of the National Natural Science Foundation of China (61232010); General Program of the National Natural Science Foundation of China (61173064); National Key Technology R&D Program of China (2012BAH39B04)

Abstract: A user's query may not match the way indexed documents represent information, which degrades retrieval performance. To address this mismatch, this paper proposes a query expansion method that combines local co-occurrence with context similarity. The method expands the user's query terms from two sources: adjacent words that co-occur with the query terms, and words that are highly correlated or coreferential with the query terms. The experiments focus on the window size used to select adjacent words and on the optimal length of the context vectors. Results show that the combined method clearly improves average precision over either single expansion method, and that average precision is best when the adjacent-word window size is 5 and the context vector length is 15.

Key words: local co-occurrence, adjacent word, context, query expansion
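
The abstract names the two expansion sources but does not give the scoring formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes raw co-occurrence counts inside a fixed token window, cosine similarity between truncated context vectors, and a linear mix controlled by a hypothetical weight `alpha`; every function name and parameter here is illustrative.

```python
from collections import Counter
from math import sqrt


def cooccurrence_counts(docs, term, window=5):
    """Count the words found within `window` tokens of each occurrence of `term`."""
    counts = Counter()
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok != term:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                # Skip the occurrence itself and any repeats of the same term.
                if j != i and tokens[j] != term:
                    counts[tokens[j]] += 1
    return counts


def context_vector(docs, term, window=5, length=15):
    """Sparse context vector: the `length` most frequent neighbours of `term`."""
    return dict(cooccurrence_counts(docs, term, window).most_common(length))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_query(docs, query_terms, window=5, length=15, alpha=0.5, top_n=3):
    """Rank candidate expansion words by an assumed linear mix of both signals."""
    vocab = {tok for tokens in docs for tok in tokens} - set(query_terms)
    scores = Counter()
    for q in query_terms:
        cooc = cooccurrence_counts(docs, q, window)
        max_cooc = max(cooc.values(), default=0)
        q_vec = context_vector(docs, q, window, length)
        for cand in vocab:
            # Signal 1: normalised co-occurrence of the candidate with the query term.
            cooc_score = cooc[cand] / max_cooc if max_cooc else 0.0
            # Signal 2: similarity of the contexts the two words appear in.
            sim_score = cosine(q_vec, context_vector(docs, cand, window, length))
            scores[cand] += alpha * cooc_score + (1 - alpha) * sim_score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]
```

Calling `expand_query(docs, ["car"])` on a list of tokenised documents returns up to `top_n` candidate words. The defaults `window=5` and `length=15` mirror the best settings reported in the abstract. The context-similarity signal lets a word that never appears next to a query term still score above zero when the two share neighbours, which is the motivation for combining the two sources.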

CLC Number: 

  • TP391