Journal of Shandong University (Natural Science) ›› 2014, Vol. 49 ›› Issue (12): 12-17. doi: 10.6040/j.issn.1671-9352.3.2014.182
XU Xia (徐霞), LI Pei-feng (李培峰), ZHENG Xin (郑新), ZHU Qiao-ming (朱巧明)
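Abstract: The performance of a semi-supervised Chinese event extraction system depends on its seed patterns, but automatically acquired seed patterns are limited in both expression and coverage, so event instances expressed through certain linguistic phenomena are hard to recognize. To address this problem, the paper draws on the theory of event consistency within a document and proposes an inference method based on coreferent events and related events: starting from the event instances already extracted, it infers other events that may be coreferent with or related to them, further improving the performance of the semi-supervised Chinese event extraction system. Experiments on the ACE 2005 Chinese corpus show that the method effectively improves the performance of the semi-supervised Chinese event extraction system.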
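The abstract does not spell out the inference rules, so the following is only a minimal sketch of one simple reading of within-document event consistency: if a trigger word has already been extracted as a given event type in a document, other unrecognized mentions of the same trigger in that document are assumed to carry the same type. The names `Mention`, `infer_by_trigger_consistency`, and `min_share` are hypothetical and are not taken from the paper, and the rule itself is an illustrative assumption rather than the authors' method.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    """A candidate event mention (hypothetical structure, not from the paper)."""
    doc_id: str
    trigger: str
    event_type: Optional[str] = None  # None: missed by the seed patterns


def infer_by_trigger_consistency(mentions: List[Mention],
                                 min_share: float = 0.5) -> List[Mention]:
    """Assign a type to untyped mentions from typed mentions in the same document.

    Assumed rule: an untyped mention inherits the most frequent type observed
    for its trigger in the same document, provided that type accounts for more
    than `min_share` of the typed occurrences of that trigger.
    """
    # Count the event types already extracted for each (document, trigger) pair.
    votes = defaultdict(Counter)
    for m in mentions:
        if m.event_type is not None:
            votes[(m.doc_id, m.trigger)][m.event_type] += 1

    result = []
    for m in mentions:
        key = (m.doc_id, m.trigger)
        if m.event_type is None and key in votes:
            best_type, count = votes[key].most_common(1)[0]
            if count / sum(votes[key].values()) > min_share:
                result.append(Mention(m.doc_id, m.trigger, best_type))
                continue
        result.append(m)  # leave already-typed or unsupported mentions unchanged
    return result


mentions = [
    Mention("doc1", "袭击", "Attack"),  # extracted by a seed pattern
    Mention("doc1", "袭击"),            # same trigger, same document, missed
    Mention("doc2", "袭击"),            # different document: no evidence
]
inferred = infer_by_trigger_consistency(mentions)
```

In this sketch the second mention in `doc1` inherits the `Attack` type, while the mention in `doc2` stays untyped because the rule only uses evidence from the same document. Inference over related (rather than coreferent) events would need additional knowledge about which event types tend to co-occur, which the abstract does not describe.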