J4 ›› 2013, Vol. 48 ›› Issue (11): 87-92.

• Articles •

Multi-source data fusion based on the expand vector space model

CHEN Ke-rui, PAN Jun   

  1. College of Computer and Information Engineering, Henan University of Economics and Law,
    Zhengzhou 450002, Henan, China
  • Received: 2013-09-02 Online: 2013-11-20 Published: 2013-11-25

Abstract:

Expanding ontology resources is one of the key tasks in natural language processing. Because information drawn from a single data source cannot reflect the overall picture and its coverage falls short of the target, an integrated data management platform is needed to store and organize data sources by category. The AVP data platform is first proposed. In building data on the AVP platform, the most important issue is integrating multi-source data: performing semantic role labeling on web data from different sources, identifying ambiguous entries, and finally merging them into data warehouses that use the sense as the basic unit. An automated semantic role matching method is then proposed to solve the semantic role matching problem that arises in multi-source data fusion. The basic idea is to use the attribute values of entries as the feature template and then apply an expanded vector space model, assisted by the co-occurrence probability of attribute values, to identify ambiguous entries. In extensive comparative experiments, the system performed well in all respects. The theory and algorithm proposed in this paper effectively solve the semantic role matching problem in multi-source data fusion.
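
The idea of comparing entries by their attribute values can be illustrated with a minimal sketch. It uses a plain vector space model with cosine similarity; the abstract does not specify how the paper expands the model or how it weights features by co-occurrence probability, so neither is reproduced here. The function names, the sample entries, and the 0.5 threshold are assumptions made only for illustration.

```python
import math
from collections import Counter


def to_vector(entry):
    """Turn an entry's attribute-value pairs into a sparse feature vector."""
    return Counter(f"{attr}={value}" for attr, value in entry.items())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def same_sense(entry_a, entry_b, threshold=0.5):
    """Guess whether two entries share a sense (threshold is hypothetical)."""
    return cosine(to_vector(entry_a), to_vector(entry_b)) >= threshold


# Hypothetical entries for the ambiguous headword "apple" from different sources.
apple_fruit_a = {"type": "fruit", "color": "red", "family": "Rosaceae"}
apple_fruit_b = {"type": "fruit", "family": "Rosaceae", "origin": "Central Asia"}
apple_company = {"type": "company", "founded": "1976", "industry": "electronics"}

print(same_sense(apple_fruit_a, apple_fruit_b))
print(same_sense(apple_fruit_a, apple_company))
```

In this sketch, entries judged to share a sense would be merged into one record, while dissimilar entries with the same headword would be kept apart as separate senses.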

Key words: natural language processing; ontology; multisource data fusion; semantic role matching

CLC Number: 

  • TP391