
Journal of Shandong University (Natural Science) ›› 2023, Vol. 58 ›› Issue (12): 10-21. doi: 10.6040/j.issn.1671-9352.2.2022.484


A memory network model based on semantic expansion of text for query suggestion

Naizhou ZHANG*, Wei CAO

  1. School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450046, Henan, China
  • Received: 2022-12-26 Online: 2023-12-20 Published: 2023-12-19
  • Contact: Naizhou ZHANG E-mail: zhangnz@126.com
  • About the author: Naizhou ZHANG (born 1970), male, associate professor, Ph.D.; research interests include Web retrieval and artificial intelligence. E-mail: zhangnz@126.com
  • Funding:
    National Natural Science Foundation of China (62072156)

Abstract:

This paper proposes a novel memory network model based on semantic expansion of text for generating context-aware query suggestions. The model uses an attention-based hierarchical encoder-decoder together with an external memory network, which produces neural attention vectors between a query and its related documents. It fuses semantic information at the query, session, and document levels and, compared with existing approaches, generates context-aware query suggestions of higher relevance. Experiments on real query logs from a commercial search engine demonstrate the effectiveness of the proposed model.

Key words: query suggestion, semantic expansion of text, context-aware, memory network, encoder-decoder model
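The abstract describes attention computed between a query and its related documents through an external memory. A minimal sketch of that idea follows, assuming single-hop dot-product attention over a list of document vectors; the scoring function, the number of hops, and the names `softmax` and `memory_attention` are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_attention(query_vec, memory):
    """Attend over memory slots with one query vector.

    Assumption: each memory slot is a document vector of the same
    dimension as the query, and relevance is a plain dot product.
    Returns the attention weights and the weighted sum of the slots.
    """
    scores = [sum(q * m for q, m in zip(query_vec, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query_vec)
    context = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(dim)
    ]
    return weights, context


# Toy example: one query vector and three document vectors (made-up values).
query = [1.0, 0.0]
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = memory_attention(query, memory)
```

In this toy case the first and third slots receive equal weight because they have the same dot product with the query, and the second slot receives less.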

CLC number: TP391

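Fig. 1 Overall architecture of the memory network query suggestion model based on semantic expansion of text

Fig. 2 Structure of the memory network encoder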

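Table 1 Dataset information

| Dataset | Date range (YYYY-MM-DD) | Size |
| --- | --- | --- |
| ModelTrainSet | 2006-03-01 to 2006-04-30 | 2,184,095 |
| L2RTrainSet | 2006-05-01 to 2006-05-14 | 638,362 |
| ValidSet | 2006-05-15 to 2006-05-24 | 328,924 |
| TestSet | 2006-05-25 to 2006-05-31 | 219,890 |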
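The date ranges in Table 1 can be read as a time-based partition of the query log. The sketch below shows one way to express it, assuming both endpoints of each range are inclusive and that a record is assigned by a single date; `SPLITS` and `assign_split` are hypothetical names introduced for illustration.

```python
from datetime import date

# Date ranges copied from Table 1; inclusive endpoints are an assumption.
SPLITS = [
    ("ModelTrainSet", date(2006, 3, 1), date(2006, 4, 30)),
    ("L2RTrainSet", date(2006, 5, 1), date(2006, 5, 14)),
    ("ValidSet", date(2006, 5, 15), date(2006, 5, 24)),
    ("TestSet", date(2006, 5, 25), date(2006, 5, 31)),
]


def assign_split(day):
    """Return the name of the split whose range contains `day`, or None."""
    for name, start, end in SPLITS:
        if start <= day <= end:
            return name
    return None
```

For example, `assign_split(date(2006, 5, 15))` returns `"ValidSet"`, and any date outside March to May 2006 returns `None`.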

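Table 2 Relevance comparison of the query suggestions generated by the different models

| Model | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
| --- | --- | --- | --- | --- |
| HRED[2] | 32.605 | 9.698 | 10.278 | 8.554 |
| Transformer | 31.068 | 7.408 | 8.947 | 7.663 |
| SETMN+Transformer | 32.102 | 8.326 | 9.311 | 7.832 |
| SETMN+HRED+Att | 36.005 | 11.815 | 10.071 | 9.685 |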

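Table 3 Relevance results on the MS MARCO dataset

| Model | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 |
| --- | --- | --- | --- | --- |
| HRED[2] | 39.126 | 18.426 | 17.417 | 13.685 |
| Transformer | 36.039 | 14.075 | 14.315 | 13.793 |
| SETMN+Transformer | 38.522 | 17.485 | 16.994 | 14.231 |
| SETMN+HRED+Att | 43.566 | 25.048 | 17.117 | 16.185 |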
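Tables 2 and 3 report BLEU-1 to BLEU-4 scores. As a reminder of what such a score measures, here is a minimal sentence-level sketch, assuming a single reference, cumulative n-gram orders with uniform weights, no smoothing, and the standard brevity penalty. The page does not state the exact BLEU configuration used in the experiments, and the tabulated values are presumably on a 0 to 100 scale, whereas this sketch returns a value between 0 and 1.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n):
    """Cumulative BLEU up to order max_n (assumption: uniform weights, no smoothing)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(log_mean)


# Toy example with made-up strings: a generated suggestion vs. the query the user issued next.
cand = "verizon wireless phone".split()
ref = "verizon wireless".split()
bleu_1 = bleu(cand, ref, 1)  # 2 of 3 unigrams match
bleu_2 = bleu(cand, ref, 2)  # geometric mean of 2/3 and 1/2
```

Without smoothing, any higher-order score drops to zero as soon as one n-gram order has no match, which is common for short queries; real evaluations often apply smoothing for that reason.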

Table 4 Learning-to-rank dataset information

| Dataset | Size |
| --- | --- |
| L2RTrainSet* | 19,108 |
| ValidSet* | 13,189 |
| TestSet* | 10,288 |

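Table 5 MRR of the learning-to-rank models for ranking query suggestions

| Model | MRR |
| --- | --- |
| ADJ[2] | 0.5114 |
| Baseline Ranker (BR)[2] | 0.5459 |
| HRED+BR[2] | 0.5533 |
| Transformer+BR | 0.5486 |
| SETMN+Transformer+BR | 0.5501 |
| SETMN+HRED+Att+BR | 0.5668 |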
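Table 5 compares the rankers by mean reciprocal rank (MRR). The sketch below computes MRR under the usual definition, assuming one target query per ranked candidate list and a reciprocal rank of 0 when the target is absent; `mean_reciprocal_rank` is an illustrative helper, not code from the paper.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of each target in its ranked list (0 if absent)."""
    if not targets:
        return 0.0
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Toy example with made-up candidates: ranks 1, 2, and "not found".
ranked_lists = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
targets = ["a", "y", "missing"]
mrr = mean_reciprocal_rank(ranked_lists, targets)  # (1 + 0.5 + 0) / 3
```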

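Table 6 Examples of query suggestions generated by the different models

# Query context + tail query Query suggestions generated by different models
ADJ HRED Transformer SETMN+Transformer SETMN+HRED+Att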
1 verizoncom verizon wireless verizon wireless verizon wireless verizon wireless
verizonwirelesscom verizon phone service verizon phone next com best buy
verizon wireless verizon home phone singular wireless verizon ringtones circuit city
wwwverizoncom verizon central com aol phone verizon phone comcast net
2 frontier airlines com frontier airlines com frontier airlines southwest airlines continental airlines
frontier airlines america west airlines northwest airlines delta airlines frontier airlines
unitedairlinescom united airlines delta airlines northwest airlines southwest airlines
southwestairlinescom southwest airlines alaska airlines airline tickets delta airlines
3 aol browser aol media player internet explorer aol browser aol browser
aol cookes aol browser aol browser windows media browser settings
google aol live help aol explorer aol spyware web unlock
setup windows media player aol upgrade aol antivirus microsoft internet browser
4 baseball bats baseball bats baseball scores sporting goods pottery barn
training aids for baseball baseball glasses baseball bats baseball bats baseball tickets
batting gloves baseball warehouse com sports authority baseball hats white sox
animated bats baseball tools com baseball shoes baseball players dickssportinggoods com
5 enterprise national car rental car rental kelly rentacar toyota financial
hertz enterprise car rental avis car car rental boston city toyota
budget car rental rates avis rental auto repair car rentals
alamo budget car rental enterprise rental auto insurance leader toyota
6 map of australia us map www map united states map of america
maps of nigra falls united airlines united airlines map of canada
united states map www map weather channel list of hotels
world atlas cheap flights washington state falls of canada
7 balls fitness american idol total fitness gold gym fitness
fitness equipment american eagle american eagle weekend black fitness
fitness clubs best buy fitness equipment german wet spa
fitness arts outlet home depot total gym tv fitness for children
8 toyota cars new toyota cars mercedes benz used cars toyota dealers
nissan cars nissan dealers used cars toyota corolla honda dealers
new cars new dodge toyota suv toyota cars used toyota dealers
toyota deals nissan cars toyota dealers nissan dealers top car dealers
References:

1 ARAMPATZIS Avi, KAMPS Jaap. A study of query length[C]//Proceedings of the 31st Annual International ACM SIGIR Conference (SIGIR). Singapore: ACM, 2008: 811-812.
2 SORDONI Alessandro, BENGIO Yoshua, VAHABI Hossein, et al. A hierarchical recurrent encoder-decoder for generative context-aware query suggestion[C]//Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM). Melbourne: ACM, 2015: 553-562.
3 MEI Qiaozhu, ZHOU Dengyong, CHURCH Kenneth. Query suggestion using hitting time[C]//Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM). California: ACM, 2008: 469-478.
4 CAO Huanhuan, JIANG Daxin, PEI Jian, et al. Context-aware query suggestion by mining click-through and session data[C]//Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(KDD). Las Vegas: ACM, 2008: 875-883.
5 HU Hao, ZHANG Mingxi, HE Zhenying, et al. Diversifying query suggestions by using topics from wikipedia[C]//ACM International Conferences on Web Intelligence(WI). Atlanta: ACM, 2013: 139-146.
6 JAIN Alpa, OZERTEM Umut, VELIPASAOGLU Emre. Synthesizing high utility suggestions for rare web search queries[C]//Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR). Beijing: ACM, 2011: 805-814.
7 SHOKOUHI Milad. Learning to personalize query auto-completion[C]//Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR). Dublin: ACM, 2013: 103-112.
8 JIANG J Y, KE Y Y, CHIEN P Y, et al. Learning user reformulation behavior for query auto-completion[C]//Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR). Gold Coast: ACM, 2014: 445-454.
9 HE Qi, JIANG Daxin, LIAO Zhen, et al. Web query recommendation via sequential query prediction[C]//Proceedings of the 25th International Conference on Data Engineering(ICDE). Shanghai: IEEE, 2009: 1443-1454.
10 BAHDANAU Dzmitry, CHO Kyunghyun, BENGIO Yoshua. Neural machine translation by jointly learning to align and translate[C]//The 3rd International Conference on Learning Representations(ICLR). San Diego: ICLR, 2015.
11 VASWANI Ashish, SHAZEER Noam, PARMAR Niki, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30(NIPS). Long Beach: MIT, 2017: 5998-6008.
12 AHMAD Uddin Wasi, CHANG Kaiwei, WANG Hongning. Context attentive document ranking and query suggestion[C]//Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR). Paris: ACM, 2019: 385-394.
13 HAN F X, NIU D, LAI K F, et al. Inferring search queries from web documents via a graph-augmented sequence to attention network[C]//Proceedings of the 2019 World Wide Web Conference on World Wide Web (WWW). San Francisco: ACM, 2019: 2792-2798.
14 ZHONG Jianling, GUO Weiwei, GAO Huiji, et al. Personalized query suggestions[C]//Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). [S.l.]: ACM, 2020: 1645-1648.
15 MUSTAR Agnès, LAMPRIER Sylvain, PIWOWARSKI Benjamin. On the study of transformers for query suggestion[J]. ACM Transactions on Information Systems, 2022, 40(1): 18:1-18:27.
16 YU Shi, LIU Jiahua, YANG Jingqin, et al. Few-shot generative conversational query rewriting[C]//Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR). [S.l.]: ACM, 2020: 1933-1936.
17 NOGUEIRA Rodrigo, YANG Wei, LIN Jimmy, et al. Document expansion by query prediction[J/OL]. arXiv. 2019. http://arxiv.org/abs/1904.08375.
18 DAI Zhuyun, CALLAN Jamie. Deeper text understanding for IR with contextual neural language modeling[C]//Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR). Paris: ACM, 2019: 985-988.
19 YANG Wei, ZHANG Haotian, LIN Jimmy. Simple applications of BERT for ad hoc document retrieval[J/OL]. arXiv. 2019. https://arxiv.org/abs/1903.10972.
20 WESTON Jason, CHOPRA Sumit, BORDES Antoine. Memory networks[C]//The 3rd International Conference on Learning Representations (ICLR). San Diego: ICLR, 2015.
21 CHEN Peng, SUN Zhongqian, BING Lidong, et al. Recurrent attention network on memory for aspect sentiment analysis[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing(EMNLP). Copenhagen: [s.n.], 2017: 452-461.
22 TANG Duyu, QIN Bing, LIU Ting. Aspect level sentiment classification with deep memory network[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP). Austin: MIT, 2016: 214-224.
23 PAPINENI Kishore, ROUKOS Salim, WARD Todd, et al. Bleu: a method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL 2002). Philadelphia: MIT, 2002: 311-318.
24 HAGEN Matthias, POTTHAST Martin, BEYER Anna, et al. Towards optimum query segmentation: in doubt without[C]//Proceedings of the 21st ACM International on Conference on Information and Knowledge Management (CIKM). Maui: ACM, 2012: 1015-1024.