A fast DGA domain detection algorithm based on deep learning

doi:10.6040/j.issn.1671-9352.0.2019.249

Abstract

A fast DGA domain classification algorithm based on deep learning, CNN-LSTM-Concat, is proposed. Multi-layer one-dimensional convolutional networks serialize the domain name characters, and an LSTM layer strengthens the modeling of long-distance dependencies between characters. Converting the multi-sequence input of the LSTM into a single vector input greatly improves training and detection speed while preserving detection performance. Experiments on public datasets show that the method achieves a precision of 98.32% for DGA domain classification. Detection is also 6.41 times faster than the LSTM method, while accuracy remains higher than that of prevailing LSTM methods.
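The speed argument in the abstract can be illustrated with a minimal sketch: an LSTM that receives one feature vector per character position must run one recurrent step per position, whereas an LSTM that receives a single vector runs one step. The abstract does not say how the single vector is formed, so the global max pooling below is only a stand-in, and the alphabet, layer sizes, random weights, and function names are all hypothetical. A single convolution layer is used for brevity where the abstract describes several.

```python
import math
import random

random.seed(0)

# All sizes and the alphabet are illustrative assumptions, not values from the paper.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."
EMBED_DIM, NUM_FILTERS, KERNEL, HIDDEN = 4, 3, 3, 5


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


EMBED = rand_matrix(len(ALPHABET), EMBED_DIM)           # character embeddings
CONV_W = rand_matrix(NUM_FILTERS, KERNEL * EMBED_DIM)   # one row per conv filter
# One weight matrix per LSTM gate, acting on the concatenation [x; h].
LSTM_W = {name: rand_matrix(HIDDEN, NUM_FILTERS + HIDDEN) for name in "ifgo"}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def embed(domain):
    """Map each known character of the domain to its embedding vector."""
    return [EMBED[ALPHABET.index(ch)] for ch in domain.lower() if ch in ALPHABET]


def conv1d_relu(seq):
    """Valid 1D convolution over character positions, followed by ReLU."""
    out = []
    for t in range(len(seq) - KERNEL + 1):
        window = [v for row in seq[t:t + KERNEL] for v in row]
        out.append([max(0.0, sum(w * x for w, x in zip(filt, window)))
                    for filt in CONV_W])
    return out


def global_max_pool(feature_maps):
    """Collapse the position axis: one value per filter (stand-in reduction)."""
    return [max(column) for column in zip(*feature_maps)]


def lstm_step(x, h, c):
    """One standard LSTM cell update (biases omitted for brevity)."""
    z = x + h  # list concatenation: [x; h]
    pre = {name: [sum(w * v for w, v in zip(row, z)) for row in LSTM_W[name]]
           for name in "ifgo"}
    i = [sigmoid(v) for v in pre["i"]]
    f = [sigmoid(v) for v in pre["f"]]
    cand = [math.tanh(v) for v in pre["g"]]
    o = [sigmoid(v) for v in pre["o"]]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_lstm(inputs):
    """Run the cell over a list of input vectors and count recurrent steps."""
    h, c, steps = [0.0] * HIDDEN, [0.0] * HIDDEN, 0
    for x in inputs:
        h, c = lstm_step(x, h, c)
        steps += 1
    return h, steps


def encode(domain):
    features = conv1d_relu(embed(domain))
    h_seq, steps_seq = run_lstm(features)                     # one step per position
    h_vec, steps_vec = run_lstm([global_max_pool(features)])  # single vector input
    return h_seq, steps_seq, h_vec, steps_vec


h_seq, steps_seq, h_vec, steps_vec = encode("examplexyz.com")
print(steps_seq, steps_vec)  # 12 1
```

Both paths end in a hidden vector of the same size that a classifier layer could consume; only the number of sequential LSTM steps differs, which is the cost the single-vector input removes.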

Key words: DGA, CNN, LSTM

CLC Number: TP391

LIU Yang, ZHAO Ke-jun, GE Lian-sheng, LIU Heng. A fast DGA domain detection algorithm based on deep learning[J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2019, 54(7): 106-112.
