Journal of Shandong University (Natural Science) ›› 2016, Vol. 51 ›› Issue (9): 121-126. doi: 10.6040/j.issn.1671-9352.1.2015.C01

News event extraction based on kernel dependency graph

LIN Li

  1. PLA University of Foreign Languages, Luoyang 471003, Henan, China
  • Received: 2015-11-14 Online: 2016-09-20 Published: 2016-09-23
  • About the author: LIN Li (born 1979), female, Ph.D., lecturer. Research interests: frame semantics and Vietnamese language information processing. E-mail: lamle@163.com

Abstract: Event extraction based on the kernel dependency graph (KDG) works mainly by finding and matching semantic structures. Building on an existing Vietnamese-English-Chinese FrameNet for South China Sea news, this paper studies the KDG semantic analysis model, the analysis of KDG-based news event extraction, and KDG generation together with news event information extraction. The research focuses on three points: the analysis models for typical KDGs and for special KDGs involving null-instantiated frame elements and frame element fusion; the representation of KDGs for event information extraction; and the process of automatically generating KDGs from annotated example sentences. The results show that the KDG-based approach to event information extraction is intuitive, linguistically well motivated, and feasible to a certain degree, and that it is well suited to discovering semantic clues in news texts. At present, KDGs can be generated automatically from example sentences that already carry frame-semantic annotation, and the corresponding event templates can be extracted from them.

Key words: event extraction, news, Vietnamese, kernel dependency
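
The last step mentioned in the abstract, generating a KDG from a frame-annotated sentence and reading an event template off it, can be pictured with a minimal sketch. This is not the paper's implementation: the data layout, the function names `build_kdg` and `extract_event_template`, the example sentence, and the frame and frame element labels are all hypothetical and chosen only for illustration. The sketch assumes that each annotated frame element already comes with the head word of its phrase, and that null-instantiated frame elements are marked with one of the FrameNet labels DNI, INI, or CNI.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class FrameElement:
    name: str                        # frame element label (hypothetical here)
    head: Optional[str]              # head word of the phrase; None if not overt
    null_type: Optional[str] = None  # "DNI", "INI" or "CNI" when null-instantiated


@dataclass
class AnnotatedSentence:
    text: str
    frame: str                       # frame evoked by the target
    target: str                      # frame-evoking word
    elements: List[FrameElement]


def build_kdg(sentence: AnnotatedSentence) -> Dict:
    """Build a simplified KDG: the target plus labelled edges to FE heads."""
    edges: List[Tuple[str, str, str]] = []
    for fe in sentence.elements:
        # A null-instantiated FE has no overt head, so use a placeholder node.
        node = fe.head if fe.head is not None else f"<{fe.null_type or 'NI'}>"
        edges.append((sentence.target, fe.name, node))
    return {"frame": sentence.frame, "target": sentence.target, "edges": edges}


def extract_event_template(kdg: Dict) -> Dict:
    """Flatten the KDG into an event template: frame, trigger, and slots."""
    slots = {label: node for _, label, node in kdg["edges"]}
    return {"event": kdg["frame"], "trigger": kdg["target"], "slots": slots}


# Hypothetical annotated sentence; frame and FE names are illustrative only.
sentence = AnnotatedSentence(
    text="The fleet patrolled the waters.",
    frame="Patrolling",
    target="patrolled",
    elements=[
        FrameElement("Patrol", "fleet"),
        FrameElement("Ground", "waters"),
        FrameElement("Purpose", None, null_type="INI"),
    ],
)
kdg = build_kdg(sentence)
template = extract_event_template(kdg)
print(template)
```

Keeping the null-instantiated element as a placeholder node, rather than dropping it, leaves the slot visible in the template so that a later step could try to fill it from context.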

CLC number:

  • TP393