JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2015, Vol. 50, Issue (07): 9-16. doi: 10.6040/j.issn.1671-9352.3.2014.026

Chinese Micro-blog named entity linking based on multisource knowledge

ZAN Hong-ying1, WU Yong-gang1, JIA Yu-xiang1, NIU Gui-ling2   

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China;
    2. School of Foreign Language, Zhengzhou University, Zhengzhou 450001, Henan, China
  • Received:2015-03-03 Online:2015-07-20 Published:2015-07-31

Abstract: Named entities are important carriers of information in text. Micro-blogs are a social network platform for sharing brief, real-time information, and their text is short, uses nonstandard words, and frequently contains neologisms. An accurate understanding of the named entities is therefore needed to analyze the text correctly. This paper proposes a Chinese micro-blog entity linking strategy based on multisource knowledge, which combines a dictionary of synonyms, encyclopedia resources, and the bag-of-words model. In this strategy, the named entities to be linked in a micro-blog are mapped to the corresponding candidate entities in the knowledge base. In experiments on the data set of the NLP&CC 2013 Chinese micro-blog entity linking track, the method obtains a micro-average accuracy of 92.97%. This is two percent higher than the state-of-the-art result, which demonstrates the effectiveness of the method.

Key words: named entity, dictionary of synonyms, bag-of-words model, encyclopedia resources, Chinese Micro-blog entity linking
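
The linking strategy summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alias table, the toy knowledge base, the entity ids, and the function names `candidates`, `link`, and `micro_accuracy` are all hypothetical, and the context is assumed to be already tokenized. A synonym lookup normalizes the mention, the knowledge base supplies candidates by name, and bag-of-words cosine similarity between the micro-blog context and each candidate's description picks the winner.

```python
import math
from collections import Counter

# Hypothetical alias table standing in for the dictionary of synonyms.
SYNONYMS = {"big apple": "new york"}

# Hypothetical knowledge base: entity id -> (name, description tokens).
KNOWLEDGE_BASE = {
    "E1": ("apple", ["fruit", "tree", "orchard", "sweet"]),
    "E2": ("apple", ["company", "iphone", "technology", "cupertino"]),
    "E3": ("new york", ["city", "united", "states", "manhattan"]),
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidates(mention: str) -> list[str]:
    """Normalize the mention through the alias table, then match by name."""
    name = SYNONYMS.get(mention.lower(), mention.lower())
    return [eid for eid, (n, _) in KNOWLEDGE_BASE.items() if n == name]


def link(mention: str, context_tokens: list[str]) -> str:
    """Return the best candidate id, or "NIL" when no candidate exists."""
    cands = candidates(mention)
    if not cands:
        return "NIL"
    ctx = Counter(context_tokens)
    # Ties (including all-zero scores) fall back to the first candidate.
    return max(cands, key=lambda eid: cosine(ctx, Counter(KNOWLEDGE_BASE[eid][1])))


def micro_accuracy(predicted: list[str], gold: list[str]) -> float:
    """Correct links divided by all mentions, pooled over the whole data set."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Here a mention with no matching knowledge-base name is mapped to "NIL", and accuracy is pooled over all mentions rather than averaged per micro-blog; both are assumptions of this sketch rather than details stated in the abstract.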

CLC Number: TP391