您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(理学版)》

山东大学学报(理学版) ›› 2016, Vol. 51 ›› Issue (8): 98-104.doi: 10.6040/j.issn.1671-9352.0.2015.435

• • 上一篇    下一篇

代价敏感属性约简的自适应分治算法

黄伟婷1,赵红2,祝峰2   

  1. 1.闽南师范大学计算机学院, 福建 漳州 363000;2.闽南师范大学粒计算及其应用重点实验室, 福建 漳州 363000
  • 收稿日期:2015-09-14 出版日期:2016-08-20 发布日期:2016-08-08
  • 作者简介:黄伟婷(1977— ),女,硕士,讲师,研究方向为粒计算和代价敏感学习. E-mail:weitinghuang92@163.com
  • 基金资助:
    国家自然科学基金资助项目(61379049;61379089);漳州市自然科学资助基金资助项目(ZZ2016J35)

Adaptive divide and conquer algorithm for cost-sensitive attribute reduction

HUANG Wei-ting1, ZHAO Hong2, ZHU William2   

  1. 1. School of Computing, Minnan Normal University, Zhangzhou 363000, Fujian, China;
    2. Lab of Granular Computing, Minnan Normal University, Zhangzhou 363000, Fujian, China
  • Received:2015-09-14 Online:2016-08-20 Published:2016-08-08

摘要: 代价敏感属性约简问题作为经典属性约简问题的自然扩展,将代价引入数据,使得属性约简问题更加具有现实意义。文章基于分治思想,先按列将数据集拆分为若干个互不相交的子数据集,然后对各子数据集进行约简,并把约简后的子数据集多路合并。依次继续执行约简和合并操作,最终得到最小测试代价约简。每个子数据集的大小及子数据集的总个数自适应于各个数据集的规模而非固定不变。为验证算法的有效性,选择四个UCI标准数据集进行实验,并与其他算法进行结果对比。实验结果表明,该算法能在较短时间内获得可接受的结果,更适应实际问题的需要。

关键词: 粒计算, 代价敏感, 属性约简, 自适应分治, 粗糙集

Abstract: Cost-sensitive attribute reduction problem is the natural extension of classical attribute reduction, and it is more practical than the classical one by introducing cost. Based on divide and conquer thought, this paper proposes a new algorithm to deal with cost-sensitive attribute reduction. Firstly, the dataset is splitted into disjoint sub-datasets according to the number of the column. Then some sub-datasets are merged after backtracking reduction on each sub-dataset. Finally, it continues reducting and merging, and gets minimal test cost reduction. The size of the sub-datasets and the number of the sub-datasets are adaptive to the scale of the dataset rather than fixed. This algorithm is tested on four UCI datasets to verify its effectiveness. Compared with other algorithms, the experimental results show that the proposed algorithm can provide the efficient solution in a relatively short time.

Key words: cost-sensitive, adaptive divide and conquer, granular computing, attribute reduction, rough sets

中图分类号: 

  • TP18
[1] PAWLAK Z. Rough sets[J]. International Journal of Computer and Information Sciences, 1982, 11(5):341-356.
[2] TURNEY P D. Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm[J]. Journal of Artificial Intelligence Research, 1995, 2(1):369-409.
[3] 张伶卫, 万文强. 基于云计算平台的代价敏感集成学习算法研究[J]. 山东大学学报(工学版), 2012, 42(4):19-23. ZHANG Lingwei, WAN Wenqiang. Study on the cost-sensitive ensemble learning algorithm based on the cloud computing platform[J]. Journal of Shandong University(Engineering Science), 2012, 42(4):19-23.
[4] 蒋盛益, 谢照青, 余雯. 基于代价敏感的朴素贝叶斯不平衡数据分类研究[J]. 计算机研究与发展, 2011, 48:387-390. JIANG Shengyi, XIE Zhaoqing, YU Wen. Naive Bayes classification algorithm based on cost sensitive for imbalanced data distribution[J]. Journal of Computer Research and Development, 2011, 48:387-390.
[5] JIANG Liangxiao, LI Chaoqun, WANG Shasha. Cost-sensitive Bayesian network classifiers[J]. Pattern Recognition Letters, 2014, 45(8):211-216.
[6] 李宁, 郭乔进, 谢俊元, 等. 基于Cascade结构的代价敏感的医学图像ROI检测方法[J]. 模式识别与人工智能, 2010, 23(2):228-234. LI Ning, GUO Qiaojin, XIE Junyuan, et al. Cost-sensitive ROI detection method for medical images based on cascade architecture[J]. Pattern Recognition and Artificial Intelligence, 2010, 23(2):228-234.
[7] LI Jingkuan, MIN Fan, ZHU William. Fast randomized algorithm for minimal test cost attribute reduction[J]. Proc ICRITO, 2013, 6:12-17.
[8] HE Huaping, MIN Fan. Accumulated cost based test-cost-sensitive attribute reduction[M]. Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. Berlin: Springer, 2011: 244-247.
[9] 刘春英. 基于关联度的代价敏感决策树生成方法[J]. 长春工业大学学报(自然科学版), 2013, 34(2):218-222. LIU Chunying. A method of generating cost-sensitive decision tree based on correlation degree[J]. Journal of Changchun University of Technology(Natural Science Edition), 2013, 34(2):218-222.
[10] JIA Xiuyi, LIAO Wenhe, TANG Zhenmin, et al. Minimum cost attribute reduction in decision-theoretic rough set models[J]. Information Sciences an International Journal, 2013, 219(1):151-167.
[11] LIAO Shujiao, LIU Jiabin, MIN Fan, et al. Minimal-test-cost reduct problem on neighborhood decision systems[J]. Journal of Information & Computational Science, 2012, 9(14):4083-4098.
[12] ZHAO Hong, MIN Fan, ZHU William. Test-cost-sensitive attribute reduction based on neighborhood rough set[C]. IEEE International Conference on Granular Computing, 2011:802-806.
[13] MIN Fan, ZHU William. Attribute reduction on data with error ranges and test costs[J]. Information Sciences, 2012, 211:48-67.
[14] MIN Fan, ZHU William. Minimal cost attribute reduction through backtracking[J]. Communications in Computer & Information Science, 2011, 258:100-107.
[15] MIN Fan, HE Huaping, QIAN Yuhua, et al. Test-cost-sensitive attribute reduction[J]. Information Sciences, 2011, 181(22):4928-4942.
[16] PAN Guiying, MIN Fan, ZHU William. A genetic algorithm to the minimal test cost reduct problem[C]. IEEE International Conference on Granular Computing, 2011: 539-544.
[17] LIU Jiabin, MIN Fan, LIAO Shujiao, et al. Test cost constraint attribute reduction through a genetic approach[J]. Journal of Information & Computational Science, 2013, 10(3):839-849.
[18] XU Zilong, MIN Fan, LIU Jiabin, et al. Ant colony optimization to minimal test cost reduction[C]. IEEE International Conference on Granular Computing, 2012, 40(6):585-590.
[19] DERVIS K, AKAY B. A comparative study of artificial bee colony algorithm[J]. Applied Mathematics and Computation, 2009, 214(1):108-132.
[20] PINAR C, BESDOK E. A conceptual comparison of the cuckoo search, particle swarm optimization, differential evolution and artificial bee colony algorithms[J]. Artificial Intelligence Review, 2013, 39(4):315-346.
[1] 何怡,邵亚斌,冯慧,郭瑞莲. 基于快速超粒方生成算法的分类器模型[J]. 《山东大学学报(理学版)》, 2026, 61(5): 65-78.
[2] 邓波军,吴南海,陈玉明,吴克寿,赖荣. 旋转粒支持向量机分类器算法[J]. 《山东大学学报(理学版)》, 2026, 61(5): 102-113.
[3] 李文焱,李丽红,王洪欣. 基于知识度量的模糊粗糙c-均值算法[J]. 《山东大学学报(理学版)》, 2026, 61(1): 49-64.
[4] 张光旭,姚卫. 基于二型模糊预序的模糊粗糙集模型[J]. 《山东大学学报(理学版)》, 2026, 61(1): 85-93.
[5] 华有霖,邵亚斌,朱学勤. 基于粒球计算的多粒度支持向量回归算法[J]. 《山东大学学报(理学版)》, 2025, 60(7): 104-115.
[6] 周缪娟,黄韩亮,张纪平,李进金. 基于FT-粗糙集构建知识结构与寻找学习路径方法[J]. 《山东大学学报(理学版)》, 2025, 60(7): 116-130.
[7] 李心如,李令强,贾成昭. 新型多粒度变精度(*,·)-模糊粗糙集[J]. 《山东大学学报(理学版)》, 2025, 60(7): 131-142.
[8] 吴海,牛娇娇,铁文彦,左建坤. 基于粒概念网络的概念格构造方法[J]. 《山东大学学报(理学版)》, 2025, 60(12): 21-31.
[9] 杨志强,冯山,尹伊,吴慧佳. 一种多因素融合的高效离群点检测方法[J]. 《山东大学学报(理学版)》, 2024, 59(8): 118-126.
[10] 陈玉明,郑光宇,焦娜. 基于粒神经网络的多标签学习[J]. 《山东大学学报(理学版)》, 2024, 59(5): 1-11.
[11] 郑晨颖,陈颖悦,侯贤宇,江连吉,廖亮. 一种邻域粒的模糊C均值聚类算法[J]. 《山东大学学报(理学版)》, 2024, 59(5): 35-44.
[12] 宋苏洋,叶军,曾广财,孙清. 基于优化可辨识矩阵的多粒度粗糙集属性约简算法[J]. 《山东大学学报(理学版)》, 2024, 59(5): 52-62.
[13] 高贺飞,李艳,王硕. 基于邻域粗糙集的偏标记特征选择[J]. 《山东大学学报(理学版)》, 2024, 59(5): 100-113.
[14] 方逢祺,吴伟志. 决策集值系统中的知识约简[J]. 《山东大学学报(理学版)》, 2024, 59(5): 82-89.
[15] 温欣,李德玉. 基于属性加权的ML-KNN方法[J]. 《山东大学学报(理学版)》, 2024, 59(3): 107-117.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!