Journal of Shandong University (Natural Science)

J4 ›› 2012, Vol. 47 ›› Issue (5): 59-62.

• Electronic Technology and Information •

EB-SVM: a support vector machine based on data pruning with information entropy

CAO Lin-lin1,2, ZHANG Hua-xiang1,2*, WANG Zhi-chao1,2

  1. Department of Information Science and Engineering, Shandong Normal University, Jinan 250014, Shandong, China;
  2. Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014, Shandong, China

  • Received: 2011-11-30  Online: 2012-05-20  Published: 2012-06-01
  • Corresponding author: ZHANG Hua-xiang (1966- ), male, professor and doctoral supervisor; his research interests include pattern recognition and evolutionary computation. Email: huaxzhang@163.com
  • About the first author: CAO Lin-lin (1987- ), female, master's student; her research interest is data mining. Email: tiankong0418@163.com
  • Supported by the National Natural Science Foundation of China (61170145), the Specialized Research Fund for the Doctoral Program of Higher Education of China (20113704110001), and the Shandong Provincial Natural Science Foundation and Key Science and Technology Program (ZR2010FM021, 2008B0026, 2010G0020115)



Abstract:

The generalization performance of an SVM applied to classification problems is reduced when data from different classes overlap seriously. A new approach, EB-SVM (entropy-based support vector machine), is presented that prunes the training data for a support vector machine based on information entropy. EB-SVM uses the information entropies of the training instances to remove instances far from the class boundaries and to delete noisy and heavily overlapped instances close to the boundaries, and then trains an SVM classifier on the pruned dataset. Experimental results show that EB-SVM takes less time than standard SVM and improves classification accuracy.

Key words: information entropy; data pruning; support vector machine; classification; data distribution
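The pruning strategy described in the abstract can be sketched as follows. The paper's exact per-sample entropy definition is not reproduced on this page, so this illustration substitutes a k-nearest-neighbour class-distribution entropy as a plausible stand-in; the function names (`knn_entropy`, `prune`), the thresholds, and the synthetic data are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def knn_entropy(X, y, k=10):
    """Shannon entropy of the class distribution among each sample's
    k nearest neighbours. Entropy 0 means a homogeneous neighbourhood
    (far from the boundary); high entropy means a confused/overlapped
    or noisy sample near the boundary."""
    n = len(X)
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)           # exclude each point itself
    nbrs = np.argsort(d2, axis=1)[:, :k]   # indices of k nearest neighbours
    ent = np.empty(n)
    for i in range(n):
        _, counts = np.unique(y[nbrs[i]], return_counts=True)
        p = counts / k
        ent[i] = -(p * np.log2(p)).sum()
    return ent

def prune(X, y, k=10, high=0.9, keep_interior=0.5, rng=None):
    """Drop samples whose neighbourhood entropy exceeds `high`
    (noisy / heavily overlapped) and randomly drop a fraction of the
    zero-entropy samples deep inside a class, keeping the rest for
    SVM training. Returns a boolean keep-mask and the entropies."""
    rng = np.random.default_rng(rng)
    ent = knn_entropy(X, y, k)
    keep = ent <= high                     # remove confused/noisy samples
    interior = ent == 0.0                  # samples far from the boundary
    drop_interior = interior & (rng.random(len(X)) > keep_interior)
    keep &= ~drop_interior                 # thin out the interior
    return keep, ent

# two overlapping Gaussian classes as toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)),
               rng.normal(1.5, 1.0, (200, 2))])
y = np.repeat([0, 1], 200)
keep, ent = prune(X, y, rng=1)
print(len(X), int(keep.sum()))  # the pruned training set is smaller
```

The retained subset `X[keep]`, `y[keep]` would then be passed to any standard SVM trainer; because fewer and cleaner instances reach the quadratic-programming step, training is faster and the learned boundary is less driven by overlapped or noisy points.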
