
Journal of Shandong University (Natural Science), 2024, Vol. 59, Issue 7: 122-130. doi: 10.6040/j.issn.1671-9352.0.2023.291

• Review •



  • About the author: Chao LI (born 1999), male, master's student; research interests: natural language processing and text classification. E-mail: 2057775195@qq.com

Chinese disease text classification model driven by medical knowledge

Chao LI, Wei LIAO*

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2023-06-30 Online: 2024-07-20 Published: 2024-07-15
  • Contact: Wei LIAO E-mail: 2057775195@qq.com; liaowei54@126.com





This study proposes a Chinese disease text classification model that integrates a knowledge graph. First, structured knowledge from an external medical knowledge graph is introduced to obtain a knowledge-enhanced vector representation of the disease text. Second, a bidirectional long short-term memory network (BiLSTM) and a convolutional neural network (CNN) extract the global and local semantic features of the text, respectively, while a joint attention mechanism improves the model's efficiency in extracting informative features. Finally, the extracted features are concatenated and fused, and a classifier outputs the classification result. Experiments on a Chinese disease text dataset show that the proposed model achieves a precision, recall, and F1 (their harmonic mean) of 95.21%, 95.64%, and 95.42%, respectively, outperforming the other models compared.
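The fusion step described in the abstract (attention over each branch, then concatenation, then a classifier) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the dot-product attention form, the query vectors, the toy feature values, and all function names are assumptions made for illustration, and the inputs merely stand in for BiLSTM and CNN outputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Weight each feature vector by softmax(dot(feature, query)) and sum them.

    Assumption: simple dot-product attention; the paper's exact form may differ.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

def fuse_and_classify(global_feats, local_feats, q_global, q_local, class_weights):
    """Pool each branch with attention, concatenate, apply a linear softmax classifier."""
    fused = attention_pool(global_feats, q_global) + attention_pool(local_feats, q_local)
    logits = [sum(w * x for w, x in zip(row, fused)) for row in class_weights]
    return softmax(logits)

# Toy stand-ins (hypothetical values): three time steps of 2-d "global" features
# and two windows of 2-d "local" features, classified into three classes.
global_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
local_feats = [[0.5, 0.5], [1.0, 0.0]]
probs = fuse_and_classify(
    global_feats, local_feats,
    q_global=[1.0, 1.0], q_local=[1.0, 0.0],
    class_weights=[[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [0.5, 0.5, 0.5, 0.5]],
)
```

In a real model the query vectors and classifier weights would be learned parameters; here they are fixed only so that the data flow is visible.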

Keywords: disease text classification, knowledge graph, convolutional neural network (CNN), bidirectional long short-term memory network (BiLSTM), attention mechanism


  • CLC Number: TP391





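Disease text | Department
Hello doctor, my hepatitis B surface antigen is negative, my alanine aminotransferase (ALT) is 169 and my aspartate aminotransferase (AST) is 87. Is that normal? | Hepatology
Many people I know have this disease. Some have had their thyroid removed; others have milder symptoms but have still become thin. What causes this disease, and how is it treated? | Endocrinology
A few days ago someone fell on the sports field and suddenly had an epileptic seizure, shaking all over. What causes epilepsy? | Neurology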







Predicted label | Positive (actual) | Negative (actual)
Positive | TP | FP
Negative | FN | TN
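The confusion-matrix entries above yield the P, R, and F1 metrics by their standard definitions. The sketch below computes them for a single class; the function names and the example counts are hypothetical, and how the paper averages the metrics across departments is not stated on this page.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions: P = TP/(TP+FP), R = TP/(TP+FN), F1 = 2PR/(P+R)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def harmonic_mean(p, r):
    """F1 expressed directly as the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=5)

# The abstract's reported P and R (in percent) give the reported F1.
reported_f1 = round(harmonic_mean(95.21, 95.64), 2)  # 95.42
```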







Model    | P (%) | R (%) | F1 (%)
SVM      | 90.32 | 90.64 | 90.48
TextCNN  | 92.17 | 92.43 | 92.30
TextRNN  | 92.25 | 92.68 | 92.46
FastText | 93.72 | 93.54 | 93.47
TextRCNN | 93.47 | 93.85 | 93.66
DKCDM    | 95.21 | 95.64 | 95.42



Model                   | P (%) | R (%) | F1 (%)
Remove KG               | 93.62 | 93.96 | 93.79
Remove TransE           | 95.18 | 95.27 | 95.22
Remove BiLSTM_Attention | 94.76 | 94.12 | 94.44
Remove CNN_Attention    | 94.17 | 94.51 | 94.34
DKCDM                   | 95.21 | 95.64 | 95.42
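The "Remove TransE" row refers to the TransE knowledge graph embedding method, which models a triple (head, relation, tail) as a translation h + r ≈ t. As a reminder of what TransE computes, here is a generic sketch with toy vectors; it is not the paper's training code, and the vectors, margin value, and function names are assumptions.

```python
import math

def transe_score(head, relation, tail):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: push true triples at least `margin` below corrupted ones."""
    return max(0.0, margin + pos_score - neg_score)

# Toy 2-d embeddings (hypothetical values).
head, relation = [1.0, 0.0], [0.0, 1.0]
true_tail, corrupted_tail = [1.0, 1.0], [0.0, 0.0]

pos = transe_score(head, relation, true_tail)       # exact translation, distance 0
neg = transe_score(head, relation, corrupted_tail)  # corrupted tail, larger distance
loss = margin_loss(pos, neg)
```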