
Journal of Shandong University (Natural Science) ›› 2014, Vol. 49 ›› Issue (12): 36-42. doi: 10.6040/j.issn.1671-9352.3.2014.139

• Article •

Coreference wordlist construction for Russian weapon and equipment names

ZHANG Ming, TANG Hui-feng, LI Zhu-feng

  1. Language Engineering Department, PLA University of Foreign Languages, Luoyang 471003, Henan, China
  • Received: 2014-09-23 Revised: 2014-10-24 Online: 2014-12-20 Published: 2014-12-20
  • About the author: ZHANG Ming (1990- ), female, master's student; research interest: language information processing. E-mail: silentglacierzm@sina.com

Abstract: Coreference resolution of weapon and equipment names is an important task in the automatic processing of Russian military texts. To address it, a pattern matching method is used. Coreference words and patterns are extracted together from the Infobox structure of Wikipedia entries, and the two are combined into new patterns. These new patterns are then applied iteratively to the text of the entries to find further coreference words. The output of the method is a coreference wordlist. The accuracy of the results shows that the pattern matching method can accurately and efficiently identify coreference words for weapon and equipment names in Russian Wikipedia.

Key words: Wikipedia, Russian, weapons and equipment, coreference wordlist
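
The iterative procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Infobox field names (`name`, `aliases`), the connective phrases, the regular expression, and the toy entry text are all assumptions chosen for illustration. The sketch seeds the wordlist from Infobox values, combines each known word with each connective to form a pattern, and re-applies the patterns to the entry text until no new coreference words appear.

```python
import re


def seed_from_infobox(infobox, alias_fields):
    """Collect seed coreference words from the given (hypothetical) Infobox fields."""
    words = set()
    for field in alias_fields:
        for value in infobox.get(field, "").split(","):
            value = value.strip()
            if value:
                words.add(value)
    return words


def build_patterns(words, connectives):
    """Combine each known word with each connective into a regex whose
    single group captures the phrase that follows the connective."""
    patterns = []
    for word in sorted(words):
        for conn in connectives:
            patterns.append(re.compile(
                re.escape(word) + r"[\s,(]*" + re.escape(conn)
                + r"\s+«?([^»,.;()]+)»?"
            ))
    return patterns


def expand(words, text, connectives, max_rounds=10):
    """Re-apply the combined patterns until a round finds no new words."""
    known = set(words)
    for _ in range(max_rounds):
        found = set()
        for pattern in build_patterns(known, connectives):
            for match in pattern.finditer(text):
                found.add(match.group(1).strip())
        new = found - known
        if not new:
            break
        known |= new
    return known


# Toy data, invented for illustration only.
infobox = {"name": "АК-47", "aliases": "56-А-212"}
text = (
    "АК-47, также известный как Автомат Калашникова, был принят на "
    "вооружение в 1949 году. Автомат Калашникова (неофициально «Калаш») "
    "широко распространён."
)
connectives = ["также известный как", "неофициально"]

wordlist = expand(seed_from_infobox(infobox, ["name", "aliases"]), text, connectives)
print(sorted(wordlist))
```

In this toy run the first round finds a new name through the seed word, and the second round finds a further name through the word discovered in the first, which is the point of iterating. A real system would need to handle Russian case inflection and noisier contexts, which this sketch ignores.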

CLC Number:

  • TP391