
Journal of Shandong University (Natural Science) ›› 2017, Vol. 52 ›› Issue (6): 40-48. doi: 10.6040/j.issn.1671-9352.0.2017.059




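  1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China; 2. College of Information Engineering, Dalian University, Dalian 116622, Liaoning, China
  • Received: 2017-02-20 Online: 2017-06-20 Published: 2017-06-21
  • Corresponding author: LIN Hong-fei (b. 1962), male, Ph.D., professor, doctoral supervisor; research interests include information retrieval, search engines, text mining, natural language processing, machine learning, and bioinformatics. E-mail: hflin@dlut.edu.cn
  • About the author: QIN Jing (b. 1981), female, Ph.D. candidate, lecturer; research interests include information retrieval and machine learning. E-mail: qinjing@dlu.edu.cn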

Music retrieval model based on semantic descriptions

QIN Jing1,2, LIN Hong-fei1*, XU Bo1   

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China;
  2. College of Information Engineering, Dalian University, Dalian 116622, Liaoning, China
  • Received: 2017-02-20 Online: 2017-06-20 Published: 2017-06-21

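Abstract (translated from the Chinese): Music retrieval based on semantic descriptions is a way to search for or discover music according to the semantics the music expresses and the listener's subjective impression of it. A typical query by semantic description (QBSD) system is defined as a supervised multi-class labeling (SML) model: it tags previously unseen music with semantically relevant labels and thereby maps the music into a "semantic space", which overcomes the semantic gap. Building on the SML model, this paper proposes using an example song as the query. The query example is mapped into the semantic space through semantic annotation, and semantically similar songs are then returned from the tagged database. A deep learning algorithm is used to design the multi-class labeling model. Experiments show that the model meets users' basic needs for semantics-based music retrieval.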

Key words: supervised multi-class labeling, convolutional neural network, query by semantic description, music retrieval

Abstract: Query by semantic description (QBSD) is a natural way to retrieve and discover relevant music based on its semantic content and on users' subjective feelings. A QBSD system can be defined as a supervised multi-class labeling (SML) model that bridges the semantic gap: a song is tagged with semantic labels and thereby mapped into a semantic space. In this paper, we propose a QBSD method based on the SML model in which a song, represented as a semantic vector, is used as the query and matched against a tagged music dataset. The resulting song list contains the songs most similar to the query in the semantic space. A convolutional neural network is integrated into the SML model. Experiments show that the proposed method retrieves relevant pieces of music in the same semantic space effectively and efficiently.

Key words: convolutional neural network, music retrieval, query by semantic description, supervised multi-class labeling
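
The retrieval step described in the abstract (a query song represented as a semantic vector, matched against a tagged dataset) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name a similarity measure, so cosine similarity is used as a stand-in, and the tag vocabulary, song ids, and tag weights are invented. The tagging model itself (the SML model with a convolutional neural network) is not sketched; the vectors below simply stand in for its output.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length semantic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector carries no tag information
    return dot / (norm_u * norm_v)


def retrieve_similar(query_vec, tagged_db, top_k=3):
    """Rank songs in tagged_db by similarity to query_vec.

    tagged_db maps a song id to its semantic vector (one weight per tag).
    Returns the top_k (song_id, score) pairs, best match first.
    """
    scored = [
        (song_id, cosine_similarity(query_vec, vec))
        for song_id, vec in tagged_db.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical tag vocabulary and tag weights, invented for illustration.
TAGS = ["calm", "energetic", "sad", "happy"]
tagged_db = {
    "song_a": [0.9, 0.1, 0.6, 0.1],
    "song_b": [0.1, 0.9, 0.1, 0.8],
    "song_c": [0.8, 0.2, 0.7, 0.2],
}

# Semantic vector of the example (query) song, as a tagger might produce it.
query = [0.85, 0.15, 0.65, 0.1]
ranking = retrieve_similar(query, tagged_db, top_k=2)
for song_id, score in ranking:
    print(f"{song_id}: {score:.3f}")
```

With these invented weights, the two calm, sad songs rank above the energetic one, which is the behavior the abstract describes as returning semantically similar music. A different distance over the semantic space could be swapped in without changing the structure.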


  • CLC number: TP391