JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2016, Vol. 51 ›› Issue (3): 111-115. doi: 10.6040/j.issn.1671-9352.1.2015.C07


Semi-supervised method for attribute extraction based on transductive learning

SU Feng-long1, XIE Qing-hua2*, HUANG Qing-quan1, QIU Ji-yuan1, YUE Zhen-jun1   

  1. Institute of Communication Engineering, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China;
    2. Institute of National Defense Engineering, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China
  • Received:2015-10-30 Online:2016-03-20 Published:2016-04-07

Abstract: To address a drawback of traditional supervised learning in text information extraction, namely that manual text annotation is time-consuming and labor-intensive, an improved semi-supervised learning model was proposed. The model combined the classification strength of support vector machines with the generalization strength of transductive learning: it learned from and tested on a small amount of tagged corpus, then added the results to the training model to adjust its predictions gradually. In attribute extraction experiments, the model performed better than traditional support vector machine methods. The model generalized well and saved a great deal of manual annotation cost.
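
The iterative idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain self-training loop around a linear SVM fitted by subgradient descent, and it omits the joint optimization over the labels of unlabeled samples that a transductive SVM performs. The function names (`train_linear_svm`, `self_train`), the confidence threshold, and the toy 2-D data are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Fit a linear SVM (labels +1/-1) by subgradient descent on the
    L2-regularized hinge loss. Hyperparameters are illustrative only."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin push the model.
            if margin < 1.0:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def decision(w, b, x):
    """Signed score of sample x; the sign is the predicted class."""
    return dot(w, x) + b


def self_train(X_lab, y_lab, X_unl, threshold=1.0, max_rounds=10):
    """Assumed self-training loop: train on the tagged samples, move
    confidently predicted untagged samples into the training set, retrain.
    Returns one predicted label per untagged sample, in input order."""
    X_train, y_train = list(X_lab), list(y_lab)
    pending = list(range(len(X_unl)))
    labels = [0] * len(X_unl)
    for _ in range(max_rounds):
        if not pending:
            break
        w, b = train_linear_svm(X_train, y_train)
        confident = [i for i in pending
                     if abs(decision(w, b, X_unl[i])) >= threshold]
        if not confident:
            break
        for i in confident:
            labels[i] = 1 if decision(w, b, X_unl[i]) > 0 else -1
            X_train.append(X_unl[i])
            y_train.append(labels[i])
        pending = [i for i in pending if i not in confident]
    if pending:
        # Label whatever never became confident with the final model.
        w, b = train_linear_svm(X_train, y_train)
        for i in pending:
            labels[i] = 1 if decision(w, b, X_unl[i]) > 0 else -1
    return labels


if __name__ == "__main__":
    # Toy data: four tagged points and five untagged ones (hypothetical).
    tagged_X = [(2.0, 2.0), (1.0, 3.0), (-2.0, -2.0), (-1.0, -3.0)]
    tagged_y = [1, 1, -1, -1]
    untagged_X = [(3.0, 3.0), (-3.0, -3.0), (2.0, 4.0), (-4.0, -2.0), (1.0, 0.5)]
    print(self_train(tagged_X, tagged_y, untagged_X))
```

In this sketch a pseudo-label is never revised once its sample joins the training set; a transductive SVM would instead keep those labels adjustable during optimization, which is the main simplification made here.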

Key words: transductive support vector machine, semi-supervised learning, information extraction, attribute extraction

CLC Number: TP391