您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(理学版)》

山东大学学报(理学版) ›› 2016, Vol. 51 ›› Issue (8): 98-104.doi: 10.6040/j.issn.1671-9352.0.2015.435

• • 上一篇    下一篇

代价敏感属性约简的自适应分治算法

黄伟婷1,赵红2,祝峰2   

  1. 1.闽南师范大学计算机学院, 福建 漳州 363000;2.闽南师范大学粒计算及其应用重点实验室, 福建 漳州 363000
  • 收稿日期:2015-09-14 出版日期:2016-08-20 发布日期:2016-08-08
  • 作者简介:黄伟婷(1977— ),女,硕士,讲师,研究方向为粒计算和代价敏感学习. E-mail:weitinghuang92@163.com
  • 基金资助:
    国家自然科学基金资助项目(61379049;61379089);漳州市自然科学资助基金资助项目(ZZ2016J35)

Adaptive divide and conquer algorithm for cost-sensitive attribute reduction

HUANG Wei-ting1, ZHAO Hong2, ZHU William2   

  1. 1. School of Computing, Minnan Normal University, Zhangzhou 363000, Fujian, China;
    2. Lab of Granular Computing, Minnan Normal University, Zhangzhou 363000, Fujian, China
  • Received:2015-09-14 Online:2016-08-20 Published:2016-08-08

摘要: 代价敏感属性约简问题作为经典属性约简问题的自然扩展,将代价引入数据,使得属性约简问题更加具有现实意义。文章基于分治思想,先按列将数据集拆分为若干个互不相交的子数据集,然后对各子数据集进行约简,并把约简后的子数据集多路合并。依次继续执行约简和合并操作,最终得到最小测试代价约简。每个子数据集的大小及子数据集的总个数自适应于各个数据集的规模而非固定不变。为验证算法的有效性,选择四个UCI标准数据集进行实验,并与其他算法进行结果对比。实验结果表明,该算法能在较短时间内获得可接受的结果,更适应实际问题的需要。

关键词: 粒计算, 代价敏感, 属性约简, 自适应分治, 粗糙集

Abstract: Cost-sensitive attribute reduction problem is the natural extension of classical attribute reduction, and it is more practical than the classical one by introducing cost. Based on divide and conquer thought, this paper proposes a new algorithm to deal with cost-sensitive attribute reduction. Firstly, the dataset is splitted into disjoint sub-datasets according to the number of the column. Then some sub-datasets are merged after backtracking reduction on each sub-dataset. Finally, it continues reducting and merging, and gets minimal test cost reduction. The size of the sub-datasets and the number of the sub-datasets are adaptive to the scale of the dataset rather than fixed. This algorithm is tested on four UCI datasets to verify its effectiveness. Compared with other algorithms, the experimental results show that the proposed algorithm can provide the efficient solution in a relatively short time.

Key words: cost-sensitive, adaptive divide and conquer, granular computing, attribute reduction, rough sets

中图分类号: 

  • TP18
[1] PAWLAK Z. Rough sets[J]. International Journal of Computer and Information Sciences, 1982, 11(5):341-356.
[2] TURNEY P D. Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm[J]. Journal of Artificial Intelligence Research, 1995, 2(1):369-409.
[3] 张伶卫, 万文强. 基于云计算平台的代价敏感集成学习算法研究[J]. 山东大学学报(工学版), 2012, 42(4):19-23. ZHANG Lingwei, WAN Wenqiang. Study on the cost-sensitive ensemble learning algorithm based on the cloud computing platform[J]. Journal of Shandong University(Engineering Science), 2012, 42(4):19-23.
[4] 蒋盛益, 谢照青, 余雯. 基于代价敏感的朴素贝叶斯不平衡数据分类研究[J]. 计算机研究与发展, 2011, 48:387-390. JIANG Shengyi, XIE Zhaoqing, YU Wen. Naive Bayes classification algorithm based on cost sensitive for imbalanced data distribution[J]. Journal of Computer Research and Development, 2011, 48:387-390.
[5] JIANG Liangxiao, LI Chaoqun, WANG Shasha. Cost-sensitive Bayesian network classifiers[J]. Pattern Recognition Letters, 2014, 45(8):211-216.
[6] 李宁, 郭乔进, 谢俊元, 等. 基于Cascade结构的代价敏感的医学图像ROI检测方法[J]. 模式识别与人工智能, 2010, 23(2):228-234. LI Ning, GUO Qiaojin, XIE Junyuan, et al. Cost-sensitive ROI detection method for medical images based on cascade architecture[J]. Pattern Recognition and Artificial Intelligence, 2010, 23(2):228-234.
[7] LI Jingkuan, MIN Fan, ZHU William. Fast randomized algorithm for minimal test cost attribute reduction[J]. Proc ICRITO, 2013, 6:12-17.
[8] HE Huaping, MIN Fan. Accumulated cost based test-cost-sensitive attribute reduction[M]. Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. Berlin: Springer, 2011: 244-247.
[9] 刘春英. 基于关联度的代价敏感决策树生成方法[J]. 长春工业大学学报(自然科学版), 2013, 34(2):218-222. LIU Chunying. A method of generating cost-sensitive decision tree based on correlation degree[J]. Journal of Changchun University of Technology(Natural Science Edition), 2013, 34(2):218-222.
[10] JIA Xiuyi, LIAO Wenhe, TANG Zhenmin, et al. Minimum cost attribute reduction in decision-theoretic rough set models[J]. Information Sciences an International Journal, 2013, 219(1):151-167.
[11] LIAO Shujiao, LIU Jiabin, MIN Fan, et al. Minimal-test-cost reduct problem on neighborhood decision systems[J]. Journal of Information & Computational Science, 2012, 9(14):4083-4098.
[12] ZHAO Hong, MIN Fan, ZHU William. Test-cost-sensitive attribute reduction based on neighborhood rough set[C]. IEEE International Conference on Granular Computing, 2011:802-806.
[13] MIN Fan, ZHU William. Attribute reduction on data with error ranges and test costs[J]. Information Sciences, 2012, 211:48-67.
[14] MIN Fan, ZHU William. Minimal cost attribute reduction through backtracking[J]. Communications in Computer & Information Science, 2011, 258:100-107.
[15] MIN Fan, HE Huaping, QIAN Yuhua, et al. Test-cost-sensitive attribute reduction[J]. Information Sciences, 2011, 181(22):4928-4942.
[16] PAN Guiying, MIN Fan, ZHU William. A genetic algorithm to the minimal test cost reduct problem[C]. IEEE International Conference on Granular Computing, 2011: 539-544.
[17] LIU Jiabin, MIN Fan, LIAO Shujiao, et al. Test cost constraint attribute reduction through a genetic approach[J]. Journal of Information & Computational Science, 2013, 10(3):839-849.
[18] XU Zilong, MIN Fan, LIU Jiabin, et al. Ant colony optimization to minimal test cost reduction[C]. IEEE International Conference on Granular Computing, 2012, 40(6):585-590.
[19] DERVIS K, AKAY B. A comparative study of artificial bee colony algorithm[J]. Applied Mathematics and Computation, 2009, 214(1):108-132.
[20] PINAR C, BESDOK E. A conceptual comparison of the cuckoo search, particle swarm optimization, differential evolution and artificial bee colony algorithms[J]. Artificial Intelligence Review, 2013, 39(4):315-346.
[1] 李同军,黄家文,吴伟志. 基于相似关系的不完备形式背景属性约简[J]. 山东大学学报(理学版), 2018, 53(8): 9-16.
[2] 左芝翠,张贤勇,莫智文,冯林. 基于决策分类的分块差别矩阵及其求核算法[J]. 山东大学学报(理学版), 2018, 53(8): 25-33.
[3] 张恩胜. 区间集概念格属性约简的组成与结构[J]. 山东大学学报(理学版), 2018, 53(8): 17-24.
[4] 李丽,管涛,林和. 基于泛系算子的泛系混合并联粗糙集模型[J]. 山东大学学报(理学版), 2017, 52(7): 22-29.
[5] 李金海,吴伟志. 形式概念分析的粒计算方法及其研究展望[J]. 山东大学学报(理学版), 2017, 52(7): 1-12.
[6] 胡谦,米据生,李磊军. 多粒度模糊粗糙近似算子的信任结构与属性约简[J]. 山东大学学报(理学版), 2017, 52(7): 30-36.
[7] 黄天意,祝峰. 基于流形学习的代价敏感特征选择[J]. 山东大学学报(理学版), 2017, 52(3): 91-96.
[8] 汪小燕,沈家兰,申元霞. 基于加权粒度和优势关系的程度多粒度粗糙集[J]. 山东大学学报(理学版), 2017, 52(3): 97-104.
[9] 陈雪,魏玲,钱婷. 基于AE-概念格的决策形式背景属性约简[J]. 山东大学学报(理学版), 2017, 52(12): 95-103.
[10] 翟俊海, 张垚, 王熙照. 相容粗糙模糊集模型[J]. 山东大学学报(理学版), 2014, 49(08): 73-79.
[11] 马媛媛, 孟慧丽, 徐久成, 朱玛. 基于粒计算的正态粒集下的格贴近度[J]. 山东大学学报(理学版), 2014, 49(08): 107-110.
[12] 罗海燕, 吕萍, 刘林忠, 杨洵. 云环境下基于模糊粗糙AHP的企业信任综合评估[J]. 山东大学学报(理学版), 2014, 49(08): 111-117.
[13] 张里博, 李华雄, 周献中, 黄兵. 人脸识别中的多粒度代价敏感三支决策[J]. 山东大学学报(理学版), 2014, 49(08): 48-57.
[14] 吴正江, 刘永利, 高岩. 拟单层覆盖上的覆盖粗糙集族[J]. 山东大学学报(理学版), 2014, 49(08): 6-14.
[15] 林姿琼, 王敬前, 祝峰. 矩阵方法计算覆盖粗糙集中最小、最大描述[J]. 山东大学学报(理学版), 2014, 49(08): 97-101.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!