Chinese Named Entity Recognition Method Based on BERT-IDCNN-CRF

doi:10.6040/j.issn.1671-9352.2.2019.076

Abstract

The pre-trained language model BERT (bidirectional encoder representations from transformers) has shown promising results in named entity recognition (NER) because it captures rich syntactic and grammatical information in sentences and represents the polysemy of words. However, most existing models based on BERT fine-tuning must update a large number of parameters, which makes both training and testing time-consuming. To address this problem, this work presents a BERT-based model for Chinese NER, named BERT-IDCNN-CRF (BERT-iterated dilated convolutional neural network-conditional random field). The proposed model uses a standard BERT model to obtain the context representation of each word, which serves as the input to IDCNN-CRF. During training, the BERT parameters in the proposed model are kept fixed, so the model reduces the number of parameters to train while retaining the ability to represent polysemy. Experimental results show that the proposed model significantly reduces training time with acceptable test error.
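The IDCNN component named in the abstract can be illustrated with a minimal, single-channel sketch in pure Python. It is not the authors' implementation: the block dilations `(1, 1, 2)`, the four iterations, the width-3 kernels, and the function names `dilated_conv1d`, `idcnn`, and `receptive_field` are all assumptions chosen for illustration. The input list stands in for one feature channel of the fixed BERT output; only the convolution weights would be trained.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], w: Sequence[float], dilation: int) -> List[float]:
    """Width-3 dilated convolution over one channel, zero-padded to keep length."""
    n = len(x)
    out = []
    for t in range(n):
        total = 0.0
        for k, offset in enumerate((-dilation, 0, dilation)):
            j = t + offset
            if 0 <= j < n:  # positions outside the sentence act as zero padding
                total += w[k] * x[j]
        out.append(max(total, 0.0))  # ReLU non-linearity
    return out


def idcnn(x: Sequence[float],
          block_weights: Sequence[Sequence[float]],
          dilations: Sequence[int] = (1, 1, 2),   # assumed block layout
          n_iterations: int = 4) -> List[float]:  # assumed iteration count
    """Apply the same dilated block repeatedly; weights are shared across iterations."""
    h = list(x)
    for _ in range(n_iterations):
        for w, d in zip(block_weights, dilations):
            h = dilated_conv1d(h, w, d)
    return h


def receptive_field(dilations: Sequence[int] = (1, 1, 2),
                    n_iterations: int = 4,
                    width: int = 3) -> int:
    """Number of input positions that can influence one output position."""
    return 1 + n_iterations * sum((width - 1) * d for d in dilations)


# Feed a single impulse through the stack and count how far it spreads.
impulse = [0.0] * 41
impulse[20] = 1.0
weights = [[1.0, 1.0, 1.0]] * 3  # hypothetical weights, one kernel per layer in the block
out = idcnn(impulse, weights)
touched = sum(1 for v in out if v > 0)
print(touched, receptive_field())
```

With these assumed settings, twelve cheap convolution layers built from only three kernels let each position see 33 neighbouring positions, which is the kind of wide context at low parameter cost that motivates placing an IDCNN on top of fixed BERT features.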

Key words: NER in Chinese, BERT, IDCNN, CRF, information security

CLC Number: TP391

Ni LI, Huan-mei GUAN, Piao YANG, Wen-yong DONG. BERT-IDCNN-CRF for named entity recognition in Chinese[J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2020, 55(1): 102-109.

Figures/Tables 13: Fig.1 to Fig.8, Table 1 to Table 5

Dataset	Location	Organization	Person	Total
Training set	36 517	20 571	17 615	74 703
Test set	2 877	1 331	1 973	6 181

Operating system	Ubuntu
CPU	i7-6700HQ@2.60GHz
GPU	GTX 1070 (8 GB)
Python	3.6
TensorFlow	1.12.0
Memory	32 GB

Models	Type	P	R	F1
BERT-IDCNN-CRF	LOC	96.32	93.81	95.05
	ORG	88.86	91.06	89.94
	PER	96.95	96.16	96.55
	ALL	94.86	93.97	94.41
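As a quick arithmetic check on the table above, F1 is the harmonic mean of precision (P) and recall (R). The sketch below recomputes it from the reported P and R values; the helper name `f1_score` is an illustrative choice, and the small tolerance assumes that P and R were rounded to two decimals before being printed.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) for BERT-IDCNN-CRF, copied from the table above.
rows = {
    "LOC": (96.32, 93.81, 95.05),
    "ORG": (88.86, 91.06, 89.94),
    "PER": (96.95, 96.16, 96.55),
    "ALL": (94.86, 93.97, 94.41),
}

for label, (p, r, reported) in rows.items():
    recomputed = f1_score(p, r)
    # A gap of about 0.01 is expected when P and R are themselves rounded.
    print(f"{label}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

The recomputed values agree with the reported F1 column to within about 0.01 for every entity type.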

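Example	Field	Content
Example 1	Sentence	中国政府陪同团 (Chinese government accompanying delegation)
	Gold entity	中国政府陪同团-ORG
	Predicted entity	中国-LOC
Example 2	Sentence	委员会的安全任务更加繁重了 (The committee's security tasks have become heavier)
	Gold entity	委员会-ORG
	Predicted entity	None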

Models	P	R	F1	Time per epoch (s)
BiLSTM-CRF	88.80	87.16	87.97	416
IDCNN-CRF	89.39	84.64	86.95	209
Radical-BiLSTM-CRF	91.28	90.62	90.95	>410
Lattice-LSTM-CRF	93.57	92.79	93.18	7 506
BERT-fine-tuning	94.09	94.54	95.37	1 363
BERT-IDCNN-CRF	94.86	93.97	94.41	216
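The training-time claim can be read directly off the time column of the comparison table. The short sketch below divides each model's per-epoch time by that of BERT-IDCNN-CRF; the variable names are illustrative, and Radical-BiLSTM-CRF is left out because the table gives only a lower bound (>410) for it.

```python
# Per-epoch training time in seconds, copied from the comparison table above.
epoch_seconds = {
    "BiLSTM-CRF": 416,
    "IDCNN-CRF": 209,
    "Lattice-LSTM-CRF": 7506,
    "BERT-fine-tuning": 1363,
    "BERT-IDCNN-CRF": 216,
}

baseline = epoch_seconds["BERT-IDCNN-CRF"]

# Ratio > 1 means the model needs more time per epoch than BERT-IDCNN-CRF.
ratios = {name: seconds / baseline for name, seconds in epoch_seconds.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x the per-epoch time of BERT-IDCNN-CRF")
```

On these figures, one epoch of BERT fine-tuning takes roughly 6.3 times as long as one epoch of BERT-IDCNN-CRF, while plain IDCNN-CRF without BERT features is only marginally faster.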
