Journal of Shandong University (Natural Science) ›› 2023, Vol. 58 ›› Issue (5): 36-45. doi: 10.6040/j.issn.1671-9352.0.2021.790
MENG Jinxu1, SHAN Hongtao1*, HUANG Runcai1, YAN Fengting3, LI Zhiwei1, ZHENG Guangyuan2, LIU Yiming1, SHI Changtong1
Abstract: We propose a dual-channel feature-fusion text classification model based on XLNet (XLNet-CNN-BiGRU, XLCBG). Compared with a single-channel model, XLCBG extracts richer semantic features by fusing the feature information of two channels, XLNet+CNN and XLNet+BiGRU. The fused features are then processed with max pooling, average pooling, or an attention mechanism, which respectively extract the element-wise global maximum vector, the global mean vector, or the attention-weighted key features to represent the whole sequence, diversifying the fusion heads and widening the choice of the best classification model. Finally, XLCBG is compared experimentally with currently popular text classification models. The results show that XLCBG-S outperforms the other models on the Chinese THUCNews dataset; XLCBG-Ap outperforms them on the English AG News dataset; and on the English 20NewsGroups dataset, XLCBG-Att is best on accuracy and recall, while XLCBG-Mp is best on precision and F1.
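The three fusion heads described above (XLCBG-Mp, XLCBG-Ap, XLCBG-Att) can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the two channel outputs are random stand-ins for the XLNet+CNN and XLNet+BiGRU features, the shapes and the concatenation-based fusion are assumptions, and the attention query `w` is a hypothetical learned parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two channel outputs (XLNet+CNN and XLNet+BiGRU);
# the shapes (seq_len, dim) are illustrative assumptions.
seq_len, dim = 8, 4
h_cnn = rng.normal(size=(seq_len, dim))
h_bigru = rng.normal(size=(seq_len, dim))

# Fuse the channels by concatenating along the feature axis
# (an assumed fusion operator for this sketch).
H = np.concatenate([h_cnn, h_bigru], axis=-1)   # (seq_len, 2*dim)

def max_pool(H):
    """XLCBG-Mp head: element-wise maximum over the sequence."""
    return H.max(axis=0)

def avg_pool(H):
    """XLCBG-Ap head: element-wise mean over the sequence."""
    return H.mean(axis=0)

def attention_pool(H, w):
    """XLCBG-Att head: softmax-weighted sum of the sequence,
    scored against a query vector w."""
    scores = H @ w                       # (seq_len,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                 # attention weights, sum to 1
    return alpha @ H                     # (2*dim,)

w = rng.normal(size=H.shape[1])          # hypothetical learned attention query
v_mp = max_pool(H)
v_ap = avg_pool(H)
v_att = attention_pool(H, w)
print(v_mp.shape, v_ap.shape, v_att.shape)
```

Each head collapses the fused sequence into one fixed-size vector, which would then feed a softmax classifier; the model variants in the abstract differ only in which head is used.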