
Journal of Shandong University (Natural Science), 2014, Vol. 49, Issue (12): 12-17. doi: 10.6040/j.issn.1671-9352.3.2014.182

• Articles

Event inference for semi-supervised Chinese event extraction

XU Xia (徐霞), LI Pei-feng (李培峰), ZHENG Xin (郑新), ZHU Qiao-ming (朱巧明)

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
  • Received: 2014-08-28 Revised: 2014-10-24 Online: 2014-12-20 Published: 2014-12-20
  • About the author: XU Xia (1989- ), female, master's student; main research interest: Chinese information processing. E-mail: xuxia1125@163.com
  • Supported by:
    National Natural Science Foundation of China (61272260); Natural Science Foundation of Jiangsu Province (BK2011282); Major Basic Research Project of Natural Science of Jiangsu Higher Education Institutions (11KIJ520003)

Abstract: The performance of semi-supervised Chinese event extraction depends on the quality of its seed patterns. However, automatically acquired seed patterns are limited in both expression style and coverage, so event mentions that appear under certain linguistic phenomena are hard to identify. To address this issue, an event inference mechanism based on coreferent events and relevant events is proposed, following the theory of event consistency within a document. Starting from the event mentions already extracted, the mechanism infers other event mentions in the same document that are likely to be coreferent with or related to them, which further improves the performance of semi-supervised Chinese event extraction. Experimental results on the ACE 2005 Chinese corpus show that the approach significantly outperforms the baseline.

Key words: semi-supervised, event extraction, event inference
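
As a rough illustration of the document-level consistency idea described in the abstract, the sketch below propagates event types from already-extracted mentions to unlabeled candidate mentions in the same document. Everything in it is assumed for illustration and is not the authors' implementation: the `Mention` type, the same-trigger rule standing in for coreference, the hand-written `RELATED` table of associated event types, and the small `TRIGGER_TYPES` lexicon.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical table: event types assumed to co-occur within one document.
RELATED: Dict[str, Set[str]] = {"Attack": {"Die", "Injure"}}

# Hypothetical trigger lexicon, used only by the related-event rule.
TRIGGER_TYPES: Dict[str, str] = {"死亡": "Die", "受伤": "Injure"}


@dataclass
class Mention:
    trigger: str
    event_type: Optional[str] = None  # None: not matched by any seed pattern


def infer(doc: List[Mention]) -> List[Mention]:
    """Return copies of the mentions with inferred event types filled in."""
    extracted = [m for m in doc if m.event_type is not None]
    # Trigger -> type for mentions the pattern matcher already found.
    by_trigger = {m.trigger: m.event_type for m in extracted}
    # Event types that are plausible given what was extracted in this document.
    related_types: Set[str] = set()
    for m in extracted:
        related_types |= RELATED.get(m.event_type, set())

    result = []
    for m in doc:
        etype = m.event_type
        if etype is None:
            if m.trigger in by_trigger:
                # Coreference-style rule (assumed): same trigger, same type.
                etype = by_trigger[m.trigger]
            else:
                # Relevance-style rule (assumed): accept a lexicon type only
                # if it is related to a type already seen in the document.
                candidate = TRIGGER_TYPES.get(m.trigger)
                if candidate in related_types:
                    etype = candidate
        result.append(Mention(m.trigger, etype))
    return result


doc = [Mention("袭击", "Attack"), Mention("袭击"), Mention("死亡"), Mention("访问")]
inferred = infer(doc)
```

In this toy document the second "袭击" mention inherits the type of the first, "死亡" is accepted because `Die` is listed as related to `Attack`, and "访问" stays unlabeled. A real system would need a proper coreference test and relatedness learned from data rather than fixed tables.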

CLC number:

  • TP391