JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2017, Vol. 52 ›› Issue (9): 13-18. doi: 10.6040/j.issn.1671-9352.0.2016.107


Study on boundary detection of users query intents

WANG Kai, HONG Yu*, QIU Ying-ying, WANG Jian, YAO Jian-min, ZHOU Guo-dong   

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
  • Received: 2016-11-25 Online: 2017-09-20 Published: 2017-09-15

Abstract: In general, a user submits several query requests to express a single query intent. Detecting the boundaries between consecutive query requests effectively is meaningful work, because it helps a search engine understand the query intent completely. Moreover, identifying the complete query intent is considerably helpful for query suggestion, query expansion, and the construction of user profiles. Building on a thorough analysis of the features used in previous research, this paper proposes a topic distribution-based similarity, which is effective with both the SVM model and the CRF model. The results show that, with topic distribution similarity, the F-measure improves by 2% over the baseline system.
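
As an illustration only, the idea of a topic distribution-based similarity between consecutive queries can be sketched as follows. The abstract does not state which similarity measure or topic model the authors use, so the cosine measure, the function names, and the topic vectors below are all assumptions; in the paper's setting such a value would presumably be one input feature to the SVM or CRF classifier rather than a decision rule on its own.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic distributions.

    Assumption: cosine is used here for illustration; the paper's actual
    similarity measure is not given in the abstract.
    """
    if len(p) != len(q):
        raise ValueError("topic distributions must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # treat an all-zero vector as sharing no topic mass
    return dot / (norm_p * norm_q)


def adjacent_similarities(topic_dists):
    """Return one similarity value per pair of consecutive queries."""
    return [cosine_similarity(a, b) for a, b in zip(topic_dists, topic_dists[1:])]


# Hypothetical topic distributions (3 topics) for four consecutive queries.
session = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]

features = adjacent_similarities(session)
```

In this made-up session the similarity between the second and third queries is much lower than for the other two pairs, which is the kind of signal a boundary classifier could exploit.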

Key words: information retrieval, boundary detection, query intent

CLC Number: 

  • TP391