Journal of Shandong University (Natural Science) ›› 2022, Vol. 57 ›› Issue (7): 65-72. doi: 10.6040/j.issn.1671-9352.1.2021.032

Cross-modal information retrieval method based on multi-view symmetric nonnegative matrix factorization

LIU Li-fang1, MA Yuan-yuan2*

  1. School of Continuing Education, Anyang Normal University, Anyang 455000, Henan, China;
    2. The Key Laboratory of Oracle Bone Inscriptions Information Processing of the Ministry of Education, Anyang Normal University, Anyang 455000, Henan, China
  • Published: 2022-06-29
  • About the authors: LIU Li-fang (born 1982), female, lecturer, master's degree; research interests: information retrieval and educational technology. E-mail: 115422504@qq.com
    *Corresponding author: MA Yuan-yuan (born 1983), male, associate professor, Ph.D.; research interests: data fusion and natural language processing. E-mail: chonghua_1983@126.com
  • Supported by:
    National Natural Science Foundation of China (U1804153); Humanities and Social Sciences Project of the Ministry of Education of China (20YJC740042)

Abstract: To address the core problems of cross-modal information retrieval, this article analyses, from the standpoint of retrieval performance, the advantages of multi-view symmetric nonnegative matrix factorization for cross-modal retrieval, and proposes a new cross-modal retrieval framework based on symmetric nonnegative matrix factorization. First, a consistent subspace representation is learned on the public Wikipedia and Pascal datasets. Then, based on this subspace, a method for projecting real-time samples into the subspace is designed. Compared with canonical correlation analysis, semantic matching and partial least squares regression, the proposed method achieves the best performance on both evaluation metrics, MAP and PR curves, which indicates its potential for cross-modal information retrieval tasks.

Key words: multi-view clustering, symmetric nonnegative matrix factorization, cross-modal retrieval, subspace learning
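
The building block named in the abstract and key words, symmetric nonnegative matrix factorization, approximates a nonnegative similarity matrix A by H Hᵀ with H ≥ 0. The following is a minimal single-view sketch under stated assumptions, not the article's multi-view framework, whose details this page does not give. The damped multiplicative update, the function names `symnmf` and `symnmf_error`, and all parameter defaults are illustrative choices.

```python
import numpy as np


def symnmf(A, k, n_iter=200, beta=0.5, seed=0, eps=1e-12):
    """Approximate a symmetric nonnegative matrix A by H @ H.T with H >= 0.

    Uses a damped multiplicative update (an assumed, commonly used rule):
        H <- H * ((1 - beta) + beta * (A @ H) / (H @ H.T @ H))
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))  # random nonnegative starting point
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps guards against division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H


def symnmf_error(A, H):
    """Squared Frobenius reconstruction error ||A - H H^T||_F^2."""
    return float(np.linalg.norm(A - H @ H.T, "fro") ** 2)
```

Each row of H can be read as a low-dimensional nonnegative representation of one sample; in a multi-view setting, one would presumably tie the factors of the per-modality similarity matrices together so that all modalities share a consistent representation.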

CLC number:

  • TP391
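
The abstract reports MAP (mean average precision) as one of its two evaluation metrics. The following sketch shows how MAP is typically computed for cross-modal retrieval once both modalities are embedded in a shared subspace. It assumes that items are ranked by cosine similarity and that a retrieved item counts as relevant when it shares the query's class label; the function names and these conventions are illustrative and are not taken from the article.

```python
import numpy as np


def average_precision(relevant):
    """Average precision of one ranked list; `relevant` is in rank order."""
    relevant = np.asarray(relevant, dtype=bool)
    if not relevant.any():
        return 0.0  # no relevant item retrieved for this query
    hits = np.cumsum(relevant)                 # relevant items seen so far
    ranks = np.arange(1, len(relevant) + 1)    # 1-based rank positions
    # Precision at each rank where a relevant item appears, then averaged.
    return float((hits[relevant] / ranks[relevant]).mean())


def cross_modal_map(queries, gallery, query_labels, gallery_labels):
    """MAP for retrieving gallery items of one modality with queries of the other.

    `queries` and `gallery` are 2-D arrays of subspace embeddings (one row per
    sample); ranking by cosine similarity is an assumption of this sketch.
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = q @ g.T
    order = np.argsort(-sims, axis=1, kind="stable")  # most similar first
    gallery_labels = np.asarray(gallery_labels)
    aps = [
        average_precision(gallery_labels[order[i]] == query_labels[i])
        for i in range(len(queries))
    ]
    return float(np.mean(aps))
```

For image-to-text retrieval, `queries` would hold the image embeddings and `gallery` the text embeddings; swapping the two gives text-to-image retrieval.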