An improved k-means algorithm for document clustering

SUO Hong-guang¹,², WANG Yu-wei²

1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; 2. School of Computer & Communication Engineering, China University of Petroleum, Dongying 257061, Shandong, China

Received: 1900-01-01 Revised: 1900-01-01 Online: 2006-10-24 Published: 2006-10-24
Contact: SUO Hong-guang

Abstract

The k-means algorithm is a popular method for document clustering, but it often gets stuck at a local maximum of the objective function that is far from the optimal solution. To improve the algorithm, a procedure based on local search is introduced. A formula for the change in the objective function is derived and used to repartition the clusters. The procedure performs a suitable number of additional iterations to enlarge the search space. Theoretical analysis and experimental results show that the improved algorithm effectively improves k-means clustering, and that its computational cost remains linear in the size of the document collection.
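The abstract does not reproduce the derived formula, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a spherical k-means setting (unit-length document vectors, cosine similarity), in which the objective to be maximized can be written as the sum of the norms of the per-cluster vector sums. Under that assumption, the change in the objective caused by moving one document between clusters can be computed from dot products alone. All function names and the parameters `eps` and `max_rounds` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)

def cluster_sums(docs, labels, k):
    """Sum of the document vectors assigned to each cluster."""
    sums = [[0.0] * len(docs[0]) for _ in range(k)]
    for x, c in zip(docs, labels):
        for d, value in enumerate(x):
            sums[c][d] += value
    return sums

def objective(sums):
    """Assumed objective: sum over clusters of the norm of the cluster sum."""
    return sum(math.sqrt(dot(s, s)) for s in sums)

def move_gain(x, s_from, s_to):
    """Change in the objective if unit vector x moves between two clusters."""
    old = math.sqrt(dot(s_from, s_from)) + math.sqrt(dot(s_to, s_to))
    # |s - x|^2 = |s|^2 - 2 x.s + 1 and |s + x|^2 = |s|^2 + 2 x.s + 1 for unit x
    new_from = math.sqrt(max(dot(s_from, s_from) - 2 * dot(x, s_from) + 1.0, 0.0))
    new_to = math.sqrt(dot(s_to, s_to) + 2 * dot(x, s_to) + 1.0)
    return new_from + new_to - old

def kmeans_pass(docs, labels, k):
    """Batch step: assign every document to its most similar centroid."""
    centroids = [normalize(s) for s in cluster_sums(docs, labels, k)]
    return [max(range(k), key=lambda j: dot(x, centroids[j])) for x in docs]

def local_search(docs, labels, k, eps=1e-9):
    """Move single documents whenever a move strictly increases the objective."""
    sums = cluster_sums(docs, labels, k)
    improved = False
    for i, x in enumerate(docs):
        src = labels[i]
        best_gain, best_dst = eps, None
        for dst in range(k):
            if dst != src:
                gain = move_gain(x, sums[src], sums[dst])
                if gain > best_gain:
                    best_gain, best_dst = gain, dst
        if best_dst is not None:
            for d, value in enumerate(x):
                sums[src][d] -= value
                sums[best_dst][d] += value
            labels[i] = best_dst  # labels is updated in place
            improved = True
    return improved

def cluster(docs, labels, k, max_rounds=100):
    """Alternate batch k-means with local search until neither changes anything."""
    docs = [normalize(x) for x in docs]
    labels = list(labels)
    for _ in range(max_rounds):
        for _ in range(max_rounds):
            new_labels = kmeans_pass(docs, labels, k)
            if new_labels == labels:
                break
            labels = new_labels
        if not local_search(docs, labels, k):
            break
    return labels, objective(cluster_sums(docs, labels, k))
```

In this sketch each single-document move is evaluated from the running cluster sums, so one local-search sweep costs a number of dot products proportional to the number of documents times the number of clusters. Starting from a degenerate partition such as `cluster([[1, 0], [0.8, 0.6], [0, 1], [0.6, 0.8]], [0, 0, 0, 0], 2)`, the batch step alone makes no change, while the local-search step can still find moves that increase the objective.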

Key words: local iteration, vector space model, k-means, document clustering

CLC Number: TP391

SUO Hong-guang, WANG Yu-wei. An improved k-means algorithm for document clustering [J]. J4, 2008, 43(1): 60-64.

Related Articles

[1] FENG Xin-ying, JI Hua, ZHANG Hua-xiang. Multi-label RBF neural networks learning algorithm based on clustering optimization [J]. J4, 2012, 47(5): 63-67.
[2] XIE Juan-ying, ZHANG Yan, XIE Wei-xin, GAO Xin-bo. A novel rough K-means clustering algorithm based on the weight of density [J]. J4, 2010, 45(7): 1-6.
[3] ZHANG Xue-feng, LIU Peng. An improved K-means algorithm by weighted distance based on maximum between-cluster variation [J]. J4, 2010, 45(7): 28-33.
[4] WANG Wei-dong, SONG Dan, SONG Ren-jie. Web news retrieval based on splited vector space model [J]. J4, 2006, 41(3): 135-138.
