Journal of Shandong University (Natural Science) ›› 2018, Vol. 53 ›› Issue (3): 46-53. doi: 10.6040/j.issn.1671-9352.1.2017.001

Lexical and semantic relevance matching based neural document ranking

ZHANG Fang-fang1,2*, CAO Xing-chao1,2

  1. School of Information Science and Technology, Peking University, Beijing 100871, China;
    2. Computer Center, Peking University, Beijing 100871, China
  • Received: 2017-09-07 Online: 2018-03-20 Published: 2018-03-13
  • Corresponding author: ZHANG Fang-fang (born 1994), female, master's degree, research interest: natural language processing. E-mail: ffz@pku.edu.cn
  • About the authors: ZHANG Fang-fang (born 1994), female, master's degree, research interest: natural language processing. E-mail: ffz@pku.edu.cn. Co-first author: CAO Xing-chao (born 1992), male, master's degree, research interest: recommender systems. E-mail: caoxingchao@pku.edu.cn

Abstract: A deep neural network model combining lexical relevance matching and semantic relevance matching is proposed to compute the matching score between a query and a document in information retrieval. The lexical relevance matching model is based on the word co-occurrence matrix between the query and the document, and mainly considers their exact term matches together with the positions of the matched terms. The semantic relevance matching model is based on pre-trained word vectors and uses a convolutional neural network to extract semantic matching information between the query and different positions of the document. The final matching score is the superposition of the scores of the two sub-models. The loss function is the hinge loss, and parameters are updated by maximizing the score difference between positive and negative samples. Experimental results show that the model reaches an NDCG@3 of 0.7904 and an NDCG@5 of 0.8183 on the validation set, a large improvement over BM25 and over either the lexical or the semantic matching model alone, which also confirms the importance of both lexical and semantic matching for information retrieval.
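As a rough illustration of the pieces named in the abstract, the sketch below builds a binary query-document word co-occurrence matrix, combines two sub-model scores, and evaluates a pairwise hinge loss. It is not the authors' implementation: the function names, the binary exact-match encoding, the reading of "superposition" as a plain sum, and the margin of 1.0 are all assumptions made for illustration. The sub-model scores are passed in as plain numbers, since the abstract does not specify the network layers.

```python
def cooccurrence_matrix(query_terms, doc_terms):
    """Binary exact-match matrix (assumed encoding).

    Entry [i][j] is 1 when query term i equals document term j, so each
    row also records the positions at which that query term occurs.
    """
    return [[1 if q == d else 0 for d in doc_terms] for q in query_terms]


def combined_score(lexical_score, semantic_score):
    """Final matching score, assuming 'superposition' means a plain sum."""
    return lexical_score + semantic_score


def pairwise_hinge_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss on one (relevant, irrelevant) document pair.

    The loss is zero once the relevant document outscores the irrelevant
    one by at least `margin` (the margin value is an assumption).
    """
    return max(0.0, margin - (pos_score - neg_score))


if __name__ == "__main__":
    query = ["neural", "ranking"]
    doc = ["neural", "document", "ranking", "neural"]
    print(cooccurrence_matrix(query, doc))

    # Hypothetical sub-model scores for a relevant and an irrelevant document.
    pos = combined_score(2.0, 1.0)
    neg = combined_score(1.5, 1.0)
    print(pairwise_hinge_loss(pos, neg))
```

Minimizing this loss over many document pairs pushes the score of the relevant document above that of the irrelevant one, which matches the abstract's description of maximizing the score difference between positive and negative samples.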

Key words: lexical relevance, convolutional neural network, semantic relevance, word co-occurrence matrix
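The NDCG@3 and NDCG@5 figures reported in the abstract are ranking metrics. The sketch below shows one common way to compute NDCG@k from graded relevance labels listed in ranked order. The abstract does not say which NDCG variant was used, so the exponential gain (2^rel - 1), the log2 position discount, and the function names are assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k ranked results.

    Uses exponential gain 2**rel - 1 and a log2(position + 1) discount,
    where position starts at 1 (an assumed, commonly used variant).
    """
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels of the documents in the order the model ranked them.
    ranked_labels = [1, 2, 3]
    print(ndcg_at_k(ranked_labels, 3))
```

A perfectly ordered list scores 1.0, and placing highly relevant documents lower in the ranking reduces the score.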

CLC number: 

  • TP389.1