Journal of Shandong University (Natural Science), 2016, Vol. 51, Issue (11): 50-57. doi: 10.6040/j.issn.1671-9352.2.2015.273


Intrusion detection on imbalanced dataset

DU Hong-le, ZHANG Yan, ZHANG Lin   

  1. School of Mathematics and Computer Application, Shangluo University, Shangluo 726000, Shaanxi, China
  • Received: 2015-09-21   Online: 2016-11-20   Published: 2016-11-22

Abstract: In a transductive support vector machine (TSVM), sample labeling errors propagate through the iterative process: a mislabeled sample lowers labeling accuracy in the next iteration, so errors accumulate and eventually shift the classification hyperplane. On an imbalanced dataset, the traditional SVM has a higher classification error rate, which in turn raises the labeling error rate in each TSVM iteration. This paper therefore proposes a TSVM algorithm for imbalanced datasets. The penalty factor of each class is calculated dynamically from the relationship between the sample densities of the classes, which improves the accuracy of sample labeling in each iteration. The algorithm inherits the TSVM rules of progressive labeling and dynamic adjustment, and reduces the offset of the classification hyperplane. Experimental results on the KDD CUP99 dataset show that the algorithm improves classification performance on imbalanced datasets, especially for minority-class samples.
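
As a rough illustration of the density-based penalty idea in the abstract, the Python sketch below computes one penalty factor per class. The abstract does not give the density measure or the penalty formula, so both are assumptions made here for illustration: `class_density` uses the sample count divided by the mean distance to the class centroid, and `penalty_factors` sets each penalty inversely proportional to that density. The function names, the `base_c` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import math


def class_density(points):
    """Assumed density proxy: sample count / mean distance to the class centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    mean_dist = sum(math.dist(p, centroid) for p in points) / n
    # Small constant avoids division by zero when all points coincide.
    return n / (mean_dist + 1e-12)


def penalty_factors(samples_by_class, base_c=1.0):
    """Assumed rule: penalty is inversely proportional to class density.

    The densest class receives base_c; sparser classes receive a larger
    penalty, so their misclassification costs more during training.
    """
    densities = {label: class_density(pts) for label, pts in samples_by_class.items()}
    max_density = max(densities.values())
    return {label: base_c * max_density / d for label, d in densities.items()}


# Toy data (hypothetical): a dense majority class and a sparse minority class.
labeled = {
    "normal": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.5, 0.0)],
    "attack": [(4.0, 4.0), (6.0, 6.0)],
}
factors = penalty_factors(labeled)
for label, c in factors.items():
    print(label, round(c, 3))
```

In an iterative transductive setting, these factors would be recomputed after each round of labeling, since newly labeled samples change the per-class densities. They could then be passed to an SVM implementation as per-class weights, for example through LIBSVM's `-wi` option.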

Key words: support vector machine, transductive learning, semi-supervised learning, imbalanced dataset, intrusion detection

CLC Number: TP301