%A MA Cheng-long, JIANG Ya-song, LI Yan-ling, ZHANG Yan, YAN Yong-hong
%T Short text classification based on word embedding similarity
%0 Journal Article
%D 2014
%J JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE)
%R 10.6040/j.issn.1671-9352.3.2014.295
%P 18-22
%V 49
%N 12
%U {http://lxbwk.njournal.sdu.edu.cn/CN/abstract/article_2044.shtml}
%8 2014-12-20
%X Because Web short texts are brief and share few words with one another, many out-of-vocabulary (OOV) words appear, which makes text classification more difficult. To address this problem, a new general framework based on word embedding similarity is proposed. First, word embeddings are learned from unlabeled data with an unsupervised learning method. Second, each OOV word is extended with similar words from the training data by computing the similarities between their word embeddings. Comparison with the baseline system shows that the proposed method performs 1%-2% better overall and more than 10% better on small training data sets.
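
The second step of the abstract (extending OOV words with similar in-vocabulary words) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `extend_oov`, the `threshold` parameter, the choice of cosine similarity, the decision to substitute the single nearest training word, and the toy three-dimensional embeddings are all assumptions made for illustration. In practice the embeddings would come from an unsupervised model trained on unlabeled data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def extend_oov(tokens, train_vocab, embeddings, threshold=0.5):
    """Map each OOV token to its most similar training-vocabulary word.

    Hypothetical helper: a token is treated as OOV when it is absent from
    train_vocab. It is substituted only if it has an embedding and some
    training word exceeds the similarity threshold; otherwise it is kept.
    """
    result = []
    for token in tokens:
        if token in train_vocab or token not in embeddings:
            result.append(token)
            continue
        best_word, best_sim = None, threshold
        # sorted() keeps the outcome deterministic when similarities tie
        for candidate in sorted(train_vocab):
            if candidate not in embeddings:
                continue
            sim = cosine(embeddings[token], embeddings[candidate])
            if sim > best_sim:
                best_word, best_sim = candidate, sim
        result.append(best_word if best_word is not None else token)
    return result


# Toy embeddings (invented values, for illustration only)
embeddings = {
    "movie": [0.9, 0.1, 0.0],
    "football": [0.0, 0.2, 0.9],
    "film": [0.8, 0.2, 0.1],
    "soccer": [0.1, 0.1, 0.95],
}
train_vocab = {"movie", "football"}

extended = extend_oov(["film", "soccer", "xyz"], train_vocab, embeddings)
print(extended)  # ['movie', 'football', 'xyz']
```

Here "film" and "soccer" do not occur in the training vocabulary, so they are mapped to their nearest training words, while "xyz" has no embedding and is left unchanged. A classifier trained on the original vocabulary could then score the extended text.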