Journal of Shandong University (Natural Science) ›› 2018, Vol. 53 ›› Issue (3): 46-53. doi: 10.6040/j.issn.1671-9352.1.2017.001
ZHANG Fang-fang (张芳芳)1,2*, CAO Xing-chao (曹兴超)1,2
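Abstract: A deep neural network model based on lexical relevance matching and semantic relevance matching is proposed to compute the matching score between a query and a document in information retrieval. The lexical relevance matching model is built on the word co-occurrence matrix between the query and the document; it mainly considers the lexical (exact-term) matching information between them and the positions of the matched words. The semantic relevance matching model is built on pre-trained word vectors and uses a convolutional neural network to extract semantic matching information at different positions between the query and the document. The final matching score is the sum of the scores of these two sub-models. The loss function is the hinge loss, and the parameters are updated by maximizing the score margin between positive and negative samples. Experimental results show that the model reaches an NDCG@3 of 0.7904 and an NDCG@5 of 0.8183 on the validation set, a large improvement over BM25 and over either the lexical matching model or the semantic matching model used alone. This also confirms the importance of both lexical and semantic matching for information retrieval.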
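The three building blocks named in the abstract (the query-document word co-occurrence matrix, the pairwise hinge loss, and NDCG@k) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the binary exact-match definition of the matrix, the margin of 1.0, and the gain 2^rel - 1 with a log2 discount are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def cooccurrence_matrix(query_terms, doc_terms):
    """Binary exact-match matrix (assumed form): entry [i][j] is 1 when
    query term i equals document term j, so matched positions are preserved."""
    return [[1 if q == d else 0 for d in doc_terms] for q in query_terms]


def pairwise_hinge_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss on a (relevant, irrelevant) document pair.
    The loss is zero once the relevant document outscores the irrelevant
    one by at least `margin` (margin=1.0 is an assumed value)."""
    return max(0.0, margin - pos_score + neg_score)


def ndcg_at_k(relevances, k):
    """NDCG@k for graded relevance labels given in ranked order.
    Uses gain 2^rel - 1 and a log2 position discount (one common convention)."""
    def dcg(rels):
        return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical toy query and document.
    query = ["deep", "retrieval"]
    doc = ["neural", "retrieval", "with", "deep", "models"]
    print(cooccurrence_matrix(query, doc))

    # Final score as the sum of two sub-model scores (placeholder numbers).
    lexical_score, semantic_score = 0.8, 0.5
    pos = lexical_score + semantic_score
    neg = 0.4
    print(pairwise_hinge_loss(pos, neg))

    # Relevance labels of the top-ranked documents for one query.
    print(ndcg_at_k([2, 0, 1, 0, 0], k=3))
```

In a trained model the matrix would feed the lexical sub-model and the two sub-model scores would come from learned networks; here they are fixed placeholder numbers so that the loss and the metric can be followed by hand.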