Journal of Shandong University (Natural Science) ›› 2020, Vol. 55 ›› Issue (1): 102-109. doi: 10.6040/j.issn.1671-9352.2.2019.076
Ni LI1, Huan-mei GUAN2,*, Piao YANG2, Wen-yong DONG2
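Abstract:
Pre-trained language models can capture rich syntactic and grammatical information about sentences and can model the polysemy of words, so they are widely used in natural language processing. BERT (bidirectional encoder representations from transformers) is one such model. Named entity recognition methods based on fine-tuning BERT have too many trainable parameters and take too long to train. To address this problem, a Chinese named entity recognition method based on BERT-IDCNN-CRF (BERT-iterated dilated convolutional neural network-conditional random field) is proposed. The method obtains contextual representations of characters from the pre-trained BERT model and then feeds the character vector sequence into an IDCNN-CRF model for training. During training the BERT parameters are kept fixed and only the IDCNN-CRF part is trained, which reduces the number of trainable parameters while preserving the modeling of polysemy. Experiments show that the model reaches an F1 score of 94.41% on the MSRA corpus, outperforming the previous best model on the Chinese named entity recognition task, Lattice-LSTM, by 1.23 percentage points. Compared with the BERT fine-tuning approach, its F1 score is slightly lower but its training time is greatly reduced. When applied to sensitive entity recognition in areas such as information security and public opinion on the power grid electromagnetic environment, the model runs faster and responds more promptly.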
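The IDCNN component named in the abstract can be illustrated with a minimal pure-Python sketch of iterated dilated convolution over a sequence. This is not the authors' implementation: the scalar per-character features, the kernel width of 3, the dilation rates `[1, 2, 4]`, and the function names `dilated_conv1d`, `iterated_dilated_cnn`, and `receptive_field` are all assumptions chosen for illustration, and nonlinearities are omitted. In the method described above, the inputs would be the fixed BERT character vectors and the outputs would go to a CRF layer.

```python
# Illustrative sketch only: kernel width, dilation rates, and weights are
# assumptions, not values taken from the paper.

def dilated_conv1d(seq, kernel, dilation):
    """1-D dilated convolution with zero padding; output length equals input length."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - half) * dilation  # skip `dilation` positions between taps
            if 0 <= idx < len(seq):
                total += w * seq[idx]
        out.append(total)
    return out


def iterated_dilated_cnn(seq, kernel, dilations, iterations=1):
    """Apply the same block of dilated convolutions `iterations` times.

    Reusing one kernel across iterations mimics the parameter sharing
    that the 'iterated' in IDCNN refers to.
    """
    for _ in range(iterations):
        for d in dilations:
            seq = dilated_conv1d(seq, kernel, d)
    return seq


def receptive_field(kernel_width, dilations, iterations=1):
    """Number of input positions that can influence a single output position."""
    field = 1
    for d in list(dilations) * iterations:
        field += (kernel_width - 1) * d
    return field


# Usage: push a single impulse through one block and count how far it spreads.
impulse = [0.0] * 21
impulse[10] = 1.0
spread = iterated_dilated_cnn(impulse, [1.0, 1.0, 1.0], [1, 2, 4])
touched = sum(1 for v in spread if v != 0)
print(touched, receptive_field(3, [1, 2, 4]))
```

With only three width-3 layers the context window already covers 15 positions, which is why a dilated stack can see wide context with few parameters. Keeping the BERT encoder fixed, as the abstract describes, means only a small stack like this (plus the CRF) has to be trained.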