J4


New word identification based on a large-scale corpus

SHI Shui-cai,YU Hong-kui,LV Xue-qiang,LI Yu-qin   

  1. Chinese Information Processing and Research Center, Beijing Information Science & Technology Univ.,
  • Received:2006-03-29 Revised:1900-01-01 Online:2006-10-24 Published:2006-10-24
  • Contact: SHI Shui-cai

Abstract: String frequency statistics, substring reduction, and several filtering methods are combined in a Chinese new-word mining system that identifies new words using character, word, and N-gram dictionaries derived from statistics over a large-scale corpus. A system based on these methods can identify new words without restrictions on length or domain.

Key words: corpus, catchword, new word
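
The abstract names three steps (string frequency statistics, substring reduction, and dictionary-based filtering) without giving their details. The following is a minimal sketch of how such a pipeline might look, not the authors' implementation. The function names, the `min_freq` and `max_len` parameters, and the specific reduction rule (drop a string when a longer string containing it has the same frequency) are all assumptions made for illustration; in particular, the paper reports no length limit, while this sketch caps n-gram length to stay small.

```python
from collections import Counter


def count_ngrams(text, max_len=4):
    """Count every character n-gram of length 2..max_len (max_len is a hypothetical cap)."""
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def reduce_substrings(counts):
    """Assumed reduction rule: drop a string if a longer string with the
    same frequency contains it, since it never occurs independently."""
    kept = {}
    for s, c in counts.items():
        absorbed = any(
            len(t) > len(s) and ct == c and s in t
            for t, ct in counts.items()
        )
        if not absorbed:
            kept[s] = c
    return kept


def find_candidates(text, known_words, min_freq=2, max_len=4):
    """Frequency statistics, then substring reduction, then a dictionary filter."""
    frequent = {
        s: c for s, c in count_ngrams(text, max_len).items() if c >= min_freq
    }
    reduced = reduce_substrings(frequent)
    # Filter step: anything already in the known-word dictionary is not "new".
    return {s: c for s, c in reduced.items() if s not in known_words}


if __name__ == "__main__":
    # Toy Latin-letter "corpus"; a real run would use segmented Chinese text.
    print(find_candidates("xyabcdzwabcdqrabcd", known_words=set()))
```

On the toy input, the fragments "ab", "bc", "cd", "abc", and "bcd" each occur three times but only ever inside "abcd", so the reduction step leaves "abcd" as the single candidate. The quadratic comparison in `reduce_substrings` is chosen for readability; a large-scale corpus would need an indexed structure such as a suffix array.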
