Journal of Shandong University (Natural Science), 2020, Vol. 55, Issue 3: 58-69. doi: 10.6040/j.issn.1671-9352.1.2019.154


Clustering method for multi-label symbolic value partition

Liu-ying WEN*, Wei YUAN

  1. School of Computer Science, Southwest Petroleum University, Chengdu 610500, Sichuan, China
  • Received: 2019-10-31  Online: 2020-03-20  Published: 2020-03-27
  • Contact: Liu-ying WEN  E-mail: wenliuying1983@163.com

Abstract:

A clustering method for multi-label symbolic value partition (CMSVP) is proposed. First, label ranking and the K-means algorithm are used to cluster the original labels. Then, an undirected weighted graph is constructed for each attribute: each node represents an attribute value, and each edge weight represents the similarity between the two nodes it connects. Finally, random walks are performed on all of the undirected weighted graphs to obtain a clustering scheme for the attribute values. Experiments on six multi-label data sets show that CMSVP can improve classification performance while effectively compressing the data.

Key words: symbolic value partition, clustering, random walk, undirected weighted graph, multi-label

CLC Number: TP391

Table 1

Example of a multi-label decision system D

      A        L
U     a1  a2   l1  l2  l3
x1    0   6    1   0   1
x2    1   1    0   0   1
x3    5   3    0   1   0
x4    2   1    1   1   0
x5    1   2    1   0   1
x6    3   0    0   1   1
x7    4   5    1   1   1
x8    6   4    1   1   0

Table 2

New multi-label decision system D^P

      A^P                     L
U     a1^P       a2^P         l1  l2  l3
x1    {0, 1, 4}  {2, 5, 6}    1   0   1
x2    {0, 1, 4}  {0, 1}       0   0   1
x3    {5, 6}     {3, 4}       0   1   0
x4    {2, 3}     {0, 1}       1   1   0
x5    {0, 1, 4}  {2, 5, 6}    1   0   1
x6    {2, 3}     {0, 1}       0   1   1
x7    {0, 1, 4}  {2, 5, 6}    1   1   1
x8    {5, 6}     {3, 4}       1   1   0
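
The step from Table 1 to Table 2 can be sketched as a lookup that replaces each attribute value with the block of the partition that contains it. This is a minimal illustration only: the partitions are read off Table 2, and the names `partitions`, `block_of` and `apply_partition` are hypothetical, not taken from the paper.

```python
# Value partitions read off Table 2: one list of blocks per attribute.
partitions = {
    "a1": [{0, 1, 4}, {2, 3}, {5, 6}],
    "a2": [{0, 1}, {2, 5, 6}, {3, 4}],
}

# Attribute columns of Table 1, keyed by object.
table1 = {
    "x1": {"a1": 0, "a2": 6},
    "x2": {"a1": 1, "a2": 1},
    "x3": {"a1": 5, "a2": 3},
    "x4": {"a1": 2, "a2": 1},
    "x5": {"a1": 1, "a2": 2},
    "x6": {"a1": 3, "a2": 0},
    "x7": {"a1": 4, "a2": 5},
    "x8": {"a1": 6, "a2": 4},
}


def block_of(attribute, value):
    """Return the partition block (as a frozenset) that contains `value`."""
    for block in partitions[attribute]:
        if value in block:
            return frozenset(block)
    raise ValueError(f"{value!r} is not covered by the partition of {attribute}")


def apply_partition(table):
    """Replace every attribute value by the block it belongs to."""
    return {
        obj: {attr: block_of(attr, value) for attr, value in row.items()}
        for obj, row in table.items()
    }


table2 = apply_partition(table1)

# Objects that share the same pair of blocks become indistinguishable.
distinct_rows = {(row["a1"], row["a2"]) for row in table2.values()}
```

In this toy example each attribute drops from seven distinct values to three blocks, and the eight objects collapse to four distinct attribute descriptions.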

Table 3

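Definition of symbols

Symbol   Definition
N        Number of instances in U
q        Number of labels in L
M        Number of attributes in A
CN       Count of occurrences of identical attribute values
D_ai     Decision sub-table for attribute a_i
V′_i     Domain of attribute values in D_ai after removing duplicates
P_j      Probability of the j-th attribute value in V′_i
G_i      Undirected weighted graph of attribute a_i
v_t      Node t in G_i
M_Gi     Transition matrix associated with G_i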
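
Table 3 pairs each graph G_i with a transition matrix M_Gi. How the paper derives that matrix is not shown in this excerpt; the sketch below uses the standard random-walk construction (each row of the weight matrix divided by its sum) purely as an assumption. The weights in `W` are invented for illustration and the function name `transition_matrix` is hypothetical.

```python
def transition_matrix(weights):
    """Row-normalize a symmetric edge-weight matrix into step probabilities.

    Entry [s][t] of the result is the probability that a walker standing on
    node s moves to node t in one step. This is the textbook construction,
    assumed here; it is not confirmed to be the paper's definition of M_Gi.
    """
    matrix = []
    for row in weights:
        total = sum(row)
        # A node without edges keeps an all-zero row instead of dividing by 0.
        matrix.append([w / total if total else 0.0 for w in row])
    return matrix


# Hypothetical similarities between three attribute values (not from the paper).
W = [
    [0.0, 0.8, 0.2],
    [0.8, 0.0, 0.0],
    [0.2, 0.0, 0.0],
]

P = transition_matrix(W)
```

With these weights, a walker on the first node moves to the second node far more often than to the third, which is the kind of signal a random-walk clustering step can use to keep strongly similar values together.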

Table 4

New decision system D′

      A        L′
U     a1  a2   {l1, l3}  {l2}
x1    0   6    1         0
x2    1   1    0         0
x3    5   3    0         1
x4    2   1    0         1
x5    1   2    1         0
x6    3   0    0         1
x7    4   5    1         1
x8    6   4    0         1
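
The merged label columns of Table 4 agree with taking a logical AND over the labels in each cluster of Table 1. That rule is inferred from the two tables, not stated in this excerpt, so the sketch below should be read as a consistency check and not as the paper's definition. The names `merge_labels` and `label_clusters` are hypothetical.

```python
LABELS = ("l1", "l2", "l3")

# Label columns of Table 1, keyed by object.
table1_labels = {
    "x1": (1, 0, 1),
    "x2": (0, 0, 1),
    "x3": (0, 1, 0),
    "x4": (1, 1, 0),
    "x5": (1, 0, 1),
    "x6": (0, 1, 1),
    "x7": (1, 1, 1),
    "x8": (1, 1, 0),
}

# Label clusters as they appear in the header of Table 4.
label_clusters = [("l1", "l3"), ("l2",)]


def merge_labels(row, clusters):
    """Collapse each cluster to 1 only if every label in it is 1 (assumed rule)."""
    by_name = dict(zip(LABELS, row))
    return tuple(
        int(all(by_name[name] for name in cluster)) for cluster in clusters
    )


table4_labels = {
    obj: merge_labels(row, label_clusters) for obj, row in table1_labels.items()
}
```

Under this assumed rule, x4 (l1 = 1, l3 = 0) gets 0 in the {l1, l3} column, which matches Table 4.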

Fig.1

Graph construction for attribute a1

Fig.2

Clustering process of G1

Table 5

Time complexity of CMSVP

Step                                         Complexity
Label clustering                             O(qN)
Constructing the undirected weighted graphs  O(qMN^2)
Random-walk clustering                       O(RMN^2)
Total                                        O((q+R)MN^2)
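
The total in Table 5 follows from summing the three steps. The derivation below assumes that the trailing "2" in the extracted entries is a lost exponent (N squared); R is used as given in the table and is not defined in this excerpt.

```latex
\begin{aligned}
T &= O(qN) + O(qMN^{2}) + O(RMN^{2}) \\
  &= O(qMN^{2}) + O(RMN^{2})
     && \text{since } qN \le qMN^{2} \text{ for } M, N \ge 1 \\
  &= O\bigl((q+R)\,MN^{2}\bigr)
\end{aligned}
```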

Table 6

Dataset information

Name      Domain   Objects  Attributes  Labels
Birds     Audio    645      260         19
Cal500    Music    502      68          174
Emotions  Music    593      72          6
Flags     Image    194      19          7
Scene     Image    2407     294         6
Yeast     Biology  2417     103         14

Table 7

Algorithm performance

          Rank
Dataset   Original  Processed  Compression rate
Birds     3279      1045       0.319
Cal500    1145      435        0.380
Emotions  1806      275        0.152
Flags     104       48         0.462
Scene     5785      981        0.170
Yeast     1968      368        0.187
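
The compression-rate column of Table 7 can be reproduced as the processed Rank divided by the original Rank, rounded to three decimals. The formula is inferred from the numbers in the table, not quoted from the paper, and `compression_rate` is a hypothetical helper name.

```python
# (original Rank, processed Rank) pairs from Table 7.
rank = {
    "Birds": (3279, 1045),
    "Cal500": (1145, 435),
    "Emotions": (1806, 275),
    "Flags": (104, 48),
    "Scene": (5785, 981),
    "Yeast": (1968, 368),
}


def compression_rate(original, processed):
    """Fraction of the original Rank that remains after processing."""
    return processed / original


rates = {
    name: round(compression_rate(original, processed), 3)
    for name, (original, processed) in rank.items()
}
```

A smaller rate means stronger compression: by this measure Emotions is compressed the most and Flags the least.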

Table 8

Run time of each algorithm (ms)

Dataset   IG-BR  IG-LP  RF-BR  RF-LP  CMSVP
Birds     2191   2057   19176  3741   66151
Cal500    1675   1840   28034  1983   505633
Emotions  473    456    1696   764    10746
Flags     241    301    333    334    5133
Scene     5793   5669   91033  20388  259470
Yeast     1501   1433   70213  9185   269810

Table 9

Time complexity of each algorithm

Algorithm  Time complexity
RF-BR      O(qN+qMN+q^2M^2)
RF-LP      O(qN+qMN+q^2M^2)
IG-BR      O(qN+M^2+qMN)
IG-LP      O(qN+M^2+qMN)
CMSVP      O((q+R)MN^2)

Table 10

Performance comparison under RAkEL classifier

Hamming Loss(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.051 7±0.007 0 0.036 4±0.009 6 0.050 3±0.005 6 0.052 6±0.006 1 0.050 8±0.006 2 0.049 5±0.005 9
Cal500 0.140 2±0.002 8 0.099 1±0.002 7 0.139 0±0.002 5 0.138 4±0.002 4 0.139 2±0.002 9 0.138 5±0.002 1
Emotions 0.238 4±0.015 3 0.172 1±0.026 6 0.271 8±0.031 8 0.260 0±0.034 0 0.248 5±0.034 7 0.239 3±0.036 7
Flags 0.271 9±0.032 5 0.217 6±0.053 7 0.257 9±0.024 4 0.260 8±0.022 8 0.265 2±0.027 7 0.265 2±0.027 7
Scene 0.137 3±0.009 5 0.146 5±0.011 5 0.158 8±0.007 4 0.158 1±0.007 1 0.159 5±0.009 3 0.159 2±0.008 3
Yeast 0.244 7±0.011 4 0.172 2±0.009 3 0.224 8±0.010 9 0.223 4±0.011 6 0.225 1±0.012 0 0.225 2±0.011 8
Ranking Loss(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.231 1±0.027 8 0.112 5±0.030 8 0.215 7±0.020 1 0.244 3±0.026 6 0.239 5±0.025 4 0.212 1±0.019 7
Cal500 0.422 8±0.010 1 0.434 1±0.007 0 0.413 2±0.009 6 0.417 4±0.008 9 0.395 5±0.007 4 0.413 9±0.009 6
Emotions 0.227 2±0.026 5 0.100 8±0.027 5 0.288 3±0.043 6 0.284 0±0.049 3 0.262 4±0.038 9 0.261 9±0.043 0
Flags 0.240 4±0.055 1 0.233 7±0.075 8 0.273 3±0.048 4 0.271 4±0.050 1 0.276 9±0.049 8 0.276 9±0.049 8
Scene 0.172 2±0.014 9 0.155 7±0.026 8 0.229 3±0.025 0 0.233 0±0.019 2 0.232 1±0.014 6 0.232 2±0.016 0
Yeast 0.241 7±0.015 7 0.185 4±0.015 1 0.271 8±0.014 0 0.278 8±0.015 2 0.262 3±0.017 6 0.262 3±0.019 5
One Error(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.820 3±0.044 8 0.877 5±0.052 5 0.773 6±0.043 6 0.835 5±0.045 3 0.818 4±0.045 0 0.765 8±0.061 7
Cal500 0.266 7±0.059 7 0.591 2±0.086 4 0.205 0±0.067 8 0.207 1±0.072 7 0.258 8±0.056 9 0.219 1±0.077 8
Emotions 0.377 8±0.060 7 0.495 9±0.049 3 0.438 6±0.063 2 0.438 4±0.063 6 0.396 4±0.063 2 0.386 3±0.072 4
Flags 0.257 1±0.098 5 0.412 6±0.058 1 0.292 9±0.095 2 0.277 6±0.104 8 0.283 2±0.083 8 0.283 2±0.083 8
Scene 0.395 9±0.033 1 0.719 2±0.032 7 0.491 9±0.036 0 0.491 5±0.026 7 0.489 8±0.026 3 0.489 4±0.025 9
Yeast 0.295 0±0.043 1 0.304 5±0.027 2 0.260 2±0.032 4 0.256 9±0.027 7 0.260 2±0.026 8 0.257 3±0.027 1
Coverage(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 5.421 4±0.502 9 1.618 1±0.387 7 5.313 2±0.408 9 5.684 0±0.443 3 5.670 4±0.467 4 5.286 0±0.419 6
Cal500 168.929 2±0.555 4 108.805 6±1.413 3 168.893 1±0.535 3 168.872 0±0.741 1 168.843 5±0.676 0 168.798 3±0.743 8
Emotions 2.104 6±0.120 0 0.460 8±0.117 1 2.401 5±0.219 7 2.401 7±0.277 3 2.299 9±0.208 2 2.299 9±0.221 7
Flags 3.921 3±0.497 1 1.744 2±0.423 8 4.093 2±0.423 8 4.103 4±0.427 9 4.112 4±0.433 1 4.112 4±0.433 1
Scene 0.953 4±0.064 2 0.492 7±0.090 2 1.235 9±0.118 0 1.257 5±0.094 0 1.253 8±0.077 0 1.255 0±0.079 3
Yeast 7.989 9±0.300 9 3.022 1±0.173 1 8.561 3±0.263 2 8.670 0±0.310 3 8.398 4±0.349 9 8.349 6±0.403 4
Average Precision(↑)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.356 7±0.050 8 0.464 7±0.112 8 0.409 9±0.041 3 0.324 9±0.048 9 0.344 4±0.044 6 0.414 6±0.057 1
Cal500 0.363 8±0.011 2 0.193 8±0.011 3 0.376 8±0.011 6 0.374 0±0.011 3 0.385 2±0.011 4 0.375 1±0.011 3
Emotions 0.740 2±0.027 6 0.838 4±0.042 0 0.690 0±0.038 8 0.693 4±0.041 3 0.718 3±0.040 4 0.722 4±0.042 7
Flags 0.803 2±0.041 2 0.759 6±0.059 7 0.786 3±0.038 7 0.788 4±0.042 3 0.783 5±0.038 8 0.783 5±0.038 8
Scene 0.748 2±0.018 0 0.707 0±0.037 4 0.684 0±0.025 8 0.681 1±0.019 6 0.682 2±0.016 9 0.682 5±0.017 0
Yeast 0.691 5±0.022 0 0.728 1±0.016 7 0.693 1±0.019 2 0.689 8±0.018 8 0.698 3±0.016 9 0.697 7±0.018 8

Table 11

Performance comparison under MLkNN classifier

Hamming Loss(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.053 8±0.005 5 0.033 9±0.006 6 0.050 9±0.006 8 0.051 2±0.006 0 0.049 4±0.005 8 0.050 2±0.006 9
Cal500 0.138 4±0.003 1 0.097 9±0.002 9 0.138 3±0.002 8 0.138 6±0.002 6 0.138 8±0.002 8 0.138 4±0.003 0
Emotions 0.199 0±0.016 0 0.131 6±0.016 5 0.233 5±0.025 8 0.240 6±0.023 5 0.245 4±0.020 8 0.239 0±0.018 8
Flags 0.284 5±0.039 4 0.203 4±0.049 9 0.269 2±0.041 5 0.286 2±0.030 1 0.274 9±0.029 2 0.274 9±0.029 2
Scene 0.101 4±0.004 4 0.113 9±0.010 2 0.150 3±0.005 7 0.147 3±0.005 6 0.148 3±0.004 5 0.146 8±0.005 8
Yeast 0.207 9±0.013 6 0.151 1±0.010 5 0.217 6±0.012 1 0.221 8±0.008 3 0.216 3±0.011 8 0.213 5±0.011 5
Ranking Loss(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.117 6±0.023 7 0.060 7±0.017 2 0.082 3±0.019 8 0.083 3±0.020 0 0.084 7±0.022 5 0.084 1±0.021 4
Cal500 0.184 8±0.007 7 0.205 4±0.009 6 0.184 0±0.006 6 0.184 9±0.007 2 0.183 9±0.006 8 0.185 1±0.007 3
Emotions 0.165 3±0.024 4 0.056 3±0.017 4 0.213 5±0.026 0 0.224 5±0.031 2 0.224 3±0.017 4 0.225 0±0.021 5
Flags 0.214 5±0.044 5 0.228 9±0.078 9 0.198 5±0.036 9 0.206 3±0.028 7 0.196 6±0.041 1 0.196 6±0.041 1
Scene 0.100 8±0.012 4 0.084 0±0.010 6 0.190 0±0.012 4 0.180 9±0.014 9 0.172 0±0.014 5 0.172 7±0.017 4
Yeast 0.185 1±0.016 0 0.135 3±0.010 9 0.193 2±0.013 0 0.200 3±0.014 3 0.189 5±0.016 9 0.187 1±0.015 2
One Error(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.779 9±0.043 0 0.831 0±0.050 9 0.713 3±0.056 5 0.717 9±0.057 2 0.728 8±0.050 6 0.728 8±0.052 5
Cal500 0.119 3±0.065 0 0.651 4±0.040 8 0.119 3±0.061 7 0.119 3±0.065 5 0.119 2±0.069 9 0.117 3±0.069 8
Emotions 0.296 8±0.049 8 0.430 1±0.050 6 0.349 0±0.036 0 0.393 0±0.067 3 0.382 9±0.033 2 0.357 7±0.057 2
Flags 0.272 6±0.070 7 0.608 4±0.098 6 0.258 4±0.054 0 0.201 1±0.048 9 0.196 3±0.099 3 0.196 3±0.099 3
Scene 0.277 1±0.022 2 0.630 7±0.030 9 0.472 4±0.032 6 0.454 9±0.043 4 0.448 7±0.039 4 0.441 6±0.042 0
Yeast 0.248 7±0.027 0 0.254 0±0.029 8 0.254 9±0.030 1 0.250 3±0.030 5 0.244 9±0.024 4 0.247 0±0.031 0
Coverage(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 3.154 9±0.521 9 0.941 3±0.246 1 2.388 0±0.395 9 2.414 7±0.403 7 2.421 8±0.415 3 2.417 4±0.389 3
Cal500 129.451 6±2.226 7 74.915 2±2.068 7 129.751 7±1.949 0 129.841 6±1.706 7 129.482 6±1.941 1 129.806 3±1.883 3
Emotions 1.796 6±0.152 1 0.327 6±0.092 4 2.035 3±0.154 2 2.062 7±0.147 4 2.081 0±0.126 9 2.084 5±0.123 3
Flags 3.793 7±0.502 8 0.993 7±0.284 4 3.662 9±0.457 4 3.757 4±0.324 5 3.667 9±0.359 9 3.667 9±0.359 9
Scene 0.592 0±0.055 2 0.275 1±0.034 8 1.035 3±0.051 7 0.988 3±0.064 4 0.945 6±0.062 8 0.945 1±0.080 4
Yeast 6.523 3±0.256 9 2.441 4±0.155 4 6.644 0±0.224 0 6.727 3±0.195 2 6.632 5±0.279 3 6.582 5±0.251 8
Average Precision(↑)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.512 7±0.063 7 0.594 9±0.081 7 0.605 3±0.065 3 0.597 9±0.067 6 0.591 7±0.056 7 0.590 8±0.067 6
Cal500 0.487 5±0.012 5 0.328 3±0.010 4 0.490 4±0.012 1 0.488 6±0.013 0 0.487 9±0.010 6 0.489 0±0.013 9
Emotions 0.790 1±0.022 6 0.897 8±0.027 4 0.748 6±0.028 5 0.728 9±0.035 0 0.735 7±0.023 0 0.741 5±0.026 3
Flags 0.805 1±0.032 3 0.713 3±0.067 3 0.818 4±0.025 3 0.820 8±0.023 8 0.826 3±0.045 8 0.826 3±0.045 8
Scene 0.831 6±0.014 8 0.814 1±0.025 6 0.706 8±0.018 7 0.718 6±0.025 7 0.724 7±0.022 9 0.728 4±0.025 9
Yeast 0.739 3±0.021 7 0.772 7±0.017 8 0.727 0±0.020 9 0.719 8±0.021 8 0.733 9±0.021 3 0.735 6±0.022 7

Table 12

Performance comparison under BRkNN classifier

Hamming Loss(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.053 2±0.006 1 0.035 9±0.008 3 0.049 4±0.005 8 0.050 1±0.005 7 0.049 8±0.005 8 0.049 4±0.006 6
Cal500 0.140 1±0.002 7 0.100 4±0.002 5 0.140 2±0.002 4 0.142 2±0.003 4 0.141 5±0.001 8 0.142 0±0.003 3
Emotions 0.198 2±0.016 8 0.130 0±0.021 0 0.223 1±0.022 6 0.227 7±0.021 5 0.238 0±0.014 6 0.231 6±0.016 6
Flags 0.273 1±0.027 2 0.208 5±0.048 3 0.260 6±0.030 4 0.267 0±0.024 2 0.254 8±0.027 0 0.254 8±0.027 0
Scene 0.109 5±0.005 1 0.123 5±0.007 6 0.155 7±0.007 5 0.149 1±0.008 4 0.151 3±0.006 8 0.150 2±0.009 2
Yeast 0.210 7±0.011 7 0.151 2±0.009 7 0.218 4±0.010 1 0.219 9±0.009 7 0.217 9±0.011 2 0.215 9±0.009 6
Ranking Loss(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.151 7±0.023 5 0.070 8±0.016 1 0.099 7±0.022 8 0.099 3±0.018 8 0.100 8±0.020 2 0.099 9±0.022 0
Cal500 0.201 1±0.007 9 0.237 3±0.013 1 0.212 5±0.007 7 0.215 7±0.008 9 0.220 2±0.008 6 0.214 9±0.008 6
Emotions 0.172 1±0.033 1 0.057 7±0.017 6 0.202 5±0.024 8 0.219 6±0.023 7 0.225 2±0.024 2 0.225 0±0.022 5
Flags 0.210 4±0.032 9 0.228 0±0.073 3 0.190 2±0.037 7 0.191 0±0.032 0 0.184 3±0.024 8 0.184 3±0.024 8
Scene 0.113 7±0.012 1 0.092 6±0.017 9 0.191 3±0.013 1 0.186 7±0.012 6 0.183 9±0.011 4 0.181 4±0.017 1
Yeast 0.197 8±0.016 4 0.145 6±0.012 1 0.201 1±0.014 2 0.214 1±0.013 1 0.199 0±0.015 2 0.194 1±0.011 3
One Error(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.815 5±0.034 9 0.823 2±0.054 0 0.707 0±0.048 9 0.696 3±0.052 2 0.717 9±0.045 6 0.702 3±0.047 1
Cal500 0.139 3±0.066 2 0.684 9±0.061 1 0.157 1±0.079 3 0.159 3±0.067 2 0.173 2±0.073 3 0.157 1±0.067 8
Emotions 0.293 6±0.055 1 0.441 9±0.051 4 0.335 6±0.047 9 0.362 4±0.048 6 0.377 7±0.041 7 0.364 4±0.056 6
Flags 0.237 6±0.054 3 0.628 9±0.103 3 0.200 8±0.061 7 0.180 5±0.047 9 0.180 3±0.057 0 0.180 3±0.057 0
Scene 0.300 8±0.026 2 0.644 8±0.026 8 0.477 3±0.044 1 0.457 8±0.036 1 0.454 5±0.036 7 0.450 3±0.040 2
Yeast 0.256 1±0.026 9 0.263 5±0.027 2 0.268 1±0.025 8 0.258 6±0.024 9 0.246 2±0.029 5 0.254 9±0.022 6
Coverage(↓)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 3.907 8±0.458 0 1.074 6±0.231 7 2.793 8±0.475 7 2.820 8±0.405 1 2.851 1±0.374 3 2.810 7±0.409 3
Cal500 138.659 1±1.945 6 86.374 3±3.427 6 144.799 6±2.470 5 144.950 9±3.038 8 147.130 3±2.659 7 144.871 8±2.685 4
Emotions 1.845 6±0.186 7 0.332 7±0.085 1 1.968 1±0.158 2 2.052 4±0.117 0 2.048 7±0.127 1 2.089 5±0.137 0
Flags 3.776 8±0.425 0 0.977 4±0.292 8 3.639 5±0.424 7 3.679 7±0.290 6 3.616 6±0.318 7 3.616 6±0.318 7
Scene 0.656 0±0.054 4 0.302 4±0.059 6 1.039 0±0.048 7 1.017 4±0.050 1 1.006 6±0.050 6 0.991 7±0.082 9
Yeast 6.837 8±0.254 5 2.553 9±0.162 1 6.917 5±0.223 7 7.126 1±0.182 0 6.839 5±0.212 6 6.782 4±0.202 4
Average Precision(↑)
Dataset Original CMSVP IG-BR IG-LP RF-BR RF-LP
Birds 0.455 7±0.050 5 0.595 9±0.080 8 0.607 6±0.058 4 0.615 3±0.058 6 0.598 1±0.056 9 0.605 9±0.061 9
Cal500 0.476 7±0.011 8 0.306 8±0.014 0 0.475 5±0.014 1 0.467 0±0.014 2 0.466 6±0.011 6 0.466 3±0.014 3
Emotions 0.788 4±0.033 0 0.891 1±0.030 2 0.761 4±0.028 8 0.743 7±0.025 8 0.739 6±0.025 3 0.741 6±0.024 1
Flags 0.815 2±0.024 8 0.706 2±0.049 1 0.837 7±0.032 1 0.835 1±0.030 3 0.839 6±0.021 8 0.839 6±0.021 8
Scene 0.814 7±0.015 6 0.800 5±0.029 0 0.704 3±0.024 4 0.714 6±0.020 6 0.716 8±0.020 7 0.719 1±0.025 3
Yeast 0.732 2±0.019 8 0.766 9±0.017 3 0.723 7±0.019 8 0.713 8±0.019 5 0.730 7±0.019 6 0.733 9±0.016 6