
Journal of Shandong University (Natural Science) ›› 2014, Vol. 49 ›› Issue (09): 109-114. doi: 10.6040/j.issn.1671-9352.2.2014.106

• Articles •

An efficient network traffic classification scheme based on similarity

DU Rui-ying1, YANG Yong2, CHEN Jing1, WANG Chi-heng1

  1. School of Computer, Wuhan University, Wuhan 430072, Hubei, China;
  2. International School of Software, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2014-06-24 Revised: 2014-08-27 Online: 2014-09-20 Published: 2014-09-30
  • About the author: DU Rui-ying (1964-), female, professor, Ph.D.; her main research area is network security. E-mail: duraying@126.com
  • Supported by:
    the National Natural Science Foundation of China (61272451, 61173154)

Abstract: The support vector machine (SVM) is a classification algorithm that combines efficiency, accuracy and real-time performance. However, when an SVM decides the class of an unlabeled instance, irrelevant classifiers also take part in the vote, which degrades both its real-time performance and its classification reliability to some extent. An efficient SVM-based network traffic classification scheme based on similarity (efficient SVM based on similarity, ESVMS) is therefore proposed. ESVMS estimates the range of classes to which an unlabeled instance may belong and excludes the irrelevant SVM classifiers from the voting decision. Experimental results show that ESVMS loses almost no classification accuracy relative to SVM while further improving real-time performance.

Key words: machine learning, support vector machine, network traffic classification

CLC Number: 

  • TP393
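
The pruning idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure ESVMS uses or how the SVM classifiers are combined, so everything below is an assumption made for illustration: one-vs-one voting over pairwise classifiers, Euclidean distance to class centroids as the similarity estimate, a fixed candidate count `k`, and stub classifiers standing in for trained binary SVMs. The class labels and feature values are invented.

```python
from itertools import combinations
from math import dist


def candidate_classes(x, centroids, k):
    """Return the k classes whose centroids are closest to x.

    Assumption: Euclidean distance to a class centroid stands in for the
    similarity estimate; the paper's actual measure is not given here.
    """
    ranked = sorted(centroids, key=lambda label: dist(x, centroids[label]))
    return ranked[:k]


def pairwise_vote(x, classifiers, classes):
    """One-vs-one voting restricted to the pairwise classifiers among `classes`."""
    votes = {label: 0 for label in classes}
    evaluated = 0
    for a, b in combinations(sorted(classes), 2):
        votes[classifiers[(a, b)](x)] += 1
        evaluated += 1
    # Ties are broken by label order; a real system needs an explicit rule.
    best = max(sorted(votes), key=lambda label: votes[label])
    return best, evaluated


def make_stub_classifier(a, b, centroids):
    """Hypothetical stand-in for a trained binary SVM: picks the nearer centroid."""
    def classify(x):
        return a if dist(x, centroids[a]) <= dist(x, centroids[b]) else b
    return classify


# Invented 2-D "flow feature" centroids for five traffic classes.
centroids = {
    "dns": (0.0, 0.0),
    "http": (5.0, 0.0),
    "p2p": (0.0, 5.0),
    "ssh": (5.0, 5.0),
    "voip": (10.0, 10.0),
}
classifiers = {
    (a, b): make_stub_classifier(a, b, centroids)
    for a, b in combinations(sorted(centroids), 2)
}

x = (4.0, 0.5)  # unlabeled instance
full_label, full_evals = pairwise_vote(x, classifiers, list(centroids))
pruned_label, pruned_evals = pairwise_vote(
    x, classifiers, candidate_classes(x, centroids, k=3)
)
print(full_label, full_evals)
print(pruned_label, pruned_evals)
```

In this toy setup both votes return the same label, but the pruned vote evaluates 3 pairwise classifiers instead of 10. That is the kind of saving the abstract attributes to excluding irrelevant classifiers; whether accuracy is preserved in practice depends on how reliably the candidate set contains the true class.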