Journal of Shandong University (Natural Science)

J4 ›› 2010, Vol. 45 ›› Issue (7): 1-6.

• Articles •

A new density-weighted rough K-means clustering algorithm

谢娟英1, 2,张琰1,谢维信2, 3,高新波2   

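  1. School of Computer Science, Shaanxi Normal University, Xi'an 710062, Shaanxi, China;
  2. School of Electronic Engineering, Xidian University, Xi'an 710071, Shaanxi, China;
  3. School of Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China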
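  • Received: 2010-04-02 Online: 2010-07-16 Published: 2010-09-06
  • About the author: XIE Juan-ying (1971-), female, associate professor and master's supervisor. Her main research interests are intelligent information processing, pattern recognition, and machine learning. Email: xiejuany@snnu.edu.cn
  • Supported by:

    Key Project of the Fundamental Research Funds for the Central Universities (GK200901006); Natural Science Basic Research Plan of Shaanxi Province (2010JM3004)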

A novel rough K-means clustering algorithm based on the weight of density

XIE Juan-ying1, 2, ZHANG Yan1, XIE Wei-xin2, 3, GAO Xin-bo2   

  1. School of Computer Science, Shaanxi Normal University, Xi'an 710062, Shaanxi, China;
  2. School of Electronic Engineering, Xidian University, Xi'an 710071, Shaanxi, China;
  3. School of Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
  • Received:2010-04-02 Online:2010-07-16 Published:2010-09-06

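Abstract:

To overcome the random selection of initial cluster centers in the rough K-means clustering algorithm, as well as the shortcomings of existing definitions of the sample density function, a new sample density function is defined based on how densely samples are packed in the region around each data object. The K high-density samples that lie farthest from one another are selected as the initial cluster centers, which removes the drawback of random initialization in existing rough K-means algorithms and brings the clustering result closer to the global optimum. In addition, when the cluster means are computed, each sample is given a different weight according to the defined density, yielding more reasonable centroids that are not affected by noise points. Tests on six data sets from the UCI Machine Learning Repository and on randomly generated synthetic data sets with noise points show that the proposed algorithm achieves better clustering results and is highly robust to noisy data.

Keywords: clustering algorithm; rough K-means; cluster center; weighting; density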
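
The initialization idea described in the abstract can be illustrated with a small Python sketch. The abstract does not state the density formula, so the sketch assumes a parameter-free stand-in: the density of a sample is the number of other samples lying within the mean pairwise distance of the data set. Treating "high density" as "at least the average density" and choosing centers by greedy farthest-point selection are also assumptions made for illustration, and the function names are hypothetical. This is not the authors' exact procedure.

```python
import math

def pairwise_distances(X):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(a, b) for b in X] for a in X]

def sample_density(dist):
    """Assumed density: number of other samples within the mean pairwise distance."""
    n = len(dist)
    radius = sum(map(sum, dist)) / (n * (n - 1))  # diagonal zeros add nothing
    # Each row contains the sample's zero distance to itself, so subtract 1.
    return [sum(d <= radius for d in row) - 1 for row in dist]

def pick_initial_centers(X, k):
    """Return indices of k dense samples that lie far from each other."""
    dist = pairwise_distances(X)
    density = sample_density(dist)
    avg = sum(density) / len(density)
    # Assumption: "high density" means at least the average density.
    cand = [i for i, d in enumerate(density) if d >= avg]
    if len(cand) < k:  # fall back to the k densest samples
        cand = sorted(range(len(X)), key=lambda i: -density[i])[:k]
    chosen = [max(cand, key=lambda i: density[i])]  # densest candidate first
    while len(chosen) < k:
        # Next center: the candidate farthest from its nearest chosen center.
        chosen.append(max(cand, key=lambda i: min(dist[i][c] for c in chosen)))
    return chosen
```

With two tight groups of points and one distant noise point, the noise point receives a density of zero under this assumed definition and is therefore never a candidate center, which is the behavior the abstract attributes to density-based initialization.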

Abstract:

A novel rough K-means clustering algorithm based on the weight of exemplar density is presented to overcome a drawback of available rough K-means algorithms, namely that they select their initial seeds randomly. A new density function is defined for each sample according to the denseness of the samples around it, without any arbitrary parameter, and the top K samples that have high density and are far away from each other are selected as the initial centers of the rough K-means clustering algorithm. Furthermore, a new weight is defined for each exemplar according to the value of the new density function, so that better centroids of each cluster can be calculated without being influenced by noisy data. Experiments on six UCI data sets and on synthetically generated data sets with noise points show that our algorithm achieves better clustering results and has strong anti-interference performance on noisy data.
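
The density-weighted centroid idea can be sketched in Python as follows. The abstract does not give the weight formula, so this sketch simply uses each sample's density value as its weight. The combination of lower-approximation and boundary-region means follows the commonly used Lingras-style rough K-means update, and the mixing value `w_lower=0.7` is an assumed illustrative setting. The function names are hypothetical and the code is not taken from the paper.

```python
def weighted_mean(points, density):
    """Mean of `points`, each sample weighted by its density."""
    total = sum(density)
    if total == 0:  # all weights zero: fall back to equal weights
        density, total = [1.0] * len(points), float(len(points))
    dims = len(points[0])
    return [sum(w * p[j] for p, w in zip(points, density)) / total
            for j in range(dims)]

def rough_centroid(lower, boundary, dens_lower, dens_boundary, w_lower=0.7):
    """Rough-cluster centroid built from density-weighted means.

    lower    : samples that certainly belong to the cluster
    boundary : samples that possibly belong to the cluster
    w_lower  : assumed importance of the lower approximation (0..1)
    """
    if not boundary:
        return weighted_mean(lower, dens_lower)
    if not lower:
        return weighted_mean(boundary, dens_boundary)
    lo = weighted_mean(lower, dens_lower)
    bd = weighted_mean(boundary, dens_boundary)
    return [w_lower * a + (1.0 - w_lower) * b for a, b in zip(lo, bd)]
```

For example, for the points (0,0), (0,1), (1,0), (1,1) plus a noise point (10,10), the unweighted mean is (2.4, 2.4), while densities of 3, 3, 3, 3, 0 give a weighted mean of (0.5, 0.5), so the noise point no longer pulls the centroid away from the dense region.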
 

Key words: clustering algorithm; rough K-means; clustering center; weight; density
