JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2022, Vol. 57 ›› Issue (7): 65-72. doi: 10.6040/j.issn.1671-9352.1.2021.032

Cross-modal information retrieval method based on multi-view symmetric nonnegative matrix factorization

LIU Li-fang1, MA Yuan-yuan2*   

  1. School of Education, Anyang Normal University, Anyang 455000, Henan, China;
  2. The Key Laboratory of Oracle Bone Inscriptions Information Processing of the Education Ministry of China, Anyang Normal University, Anyang 455000, Henan, China
  • Published:2022-06-29

Abstract: This article summarizes the strategies and core issues in cross-modal information retrieval and analyzes how multi-view symmetric nonnegative matrix factorization can improve retrieval performance. A new cross-modal retrieval framework based on symmetric nonnegative matrix factorization is proposed. First, a consistent subspace representation is learned on the Wikipedia and Pascal datasets. Then, based on this subspace, a method for mapping real-time samples into the subspace is designed. Compared with canonical correlation analysis, semantic matching, and partial least squares regression, the proposed method performs best in terms of mean average precision (MAP) and precision-recall (PR) curves. The results demonstrate the potential of the proposed algorithm for cross-modal information retrieval.

Key words: multi-view clustering, symmetric nonnegative matrix factorization, cross-modal retrieval, subspace learning

CLC Number: TP391
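
The abstract gives no formulas, so the following is only a minimal sketch of the general idea of learning one shared nonnegative representation from several views. It assumes the simplest consensus objective, minimizing the sum over views of ||A_v - H H^T||_F^2 with H >= 0, and uses the standard damped multiplicative update for symmetric NMF. The function names, the toy data, and the objective itself are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def multiview_symnmf(similarities, k, n_iter=200, beta=0.5, eps=1e-10, seed=0):
    """Return a shared nonnegative n x k factor H for a list of n x n similarity matrices.

    Assumed objective (not taken from the paper): sum_v ||A_v - H H^T||_F^2, H >= 0.
    Up to a constant, this equals SymNMF applied to the mean of the A_v.
    """
    A = np.mean(np.stack(similarities), axis=0)
    A = (A + A.T) / 2.0                      # enforce symmetry
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Scale the random start so that H H^T is on the same order as A.
    H = 2.0 * np.sqrt(A.mean() / k) * rng.random((n, k))
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps          # eps guards against division by zero
        # Damped multiplicative update; keeps every entry of H nonnegative.
        H *= (1.0 - beta) + beta * numer / denom
    return H

def objective(similarities, H):
    """Sum of squared Frobenius reconstruction errors over all views."""
    return sum(np.linalg.norm(A - H @ H.T, "fro") ** 2 for A in similarities)

# Toy data (hypothetical): two noisy views of 6 samples forming 2 groups of 3.
rng = np.random.default_rng(1)
block = np.kron(np.eye(2), np.ones((3, 3)))
views = []
for _ in range(2):
    noise = 0.1 * rng.random((6, 6))
    views.append(block + (noise + noise.T) / 2.0)

H = multiview_symnmf(views, k=2)
labels = H.argmax(axis=1)                    # row-wise argmax as a cluster assignment
print(labels)
```

Each row of H can be read as the coordinates of a sample in the shared subspace. Retrieval across modalities would then compare rows of H, for example by cosine similarity, although the abstract does not specify the distance the authors use.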