Text information extraction with a multiple-template hidden Markov model based on text blocks

Using text blocks based on multiple templates hidden Markov model for text information extraction

WANG Lei, CHEN Zhi-ping, LI Zhi-cheng

1. Department of Computer & Information Science, Fujian University of Technology, Fuzhou 350014, Fujian, China

Received:2006-04-01 Revised:1900-01-01 Online:2006-10-24 Published:2006-10-24
Contact: WANG Lei

Abstract

Abstract: Training data drawn from varied sources hinders the learning of optimal model parameters. To address this, a novel text information extraction algorithm based on a hidden Markov model with multiple templates is proposed. The algorithm uses format information and list separators to segment the text into blocks. It trains the emission probability parameters on all of the training data together, and uses multiple form templates to train the initial probability and transition probability parameters of the hidden Markov model. Experimental results show better precision and recall than a simple hidden Markov model.

Key words: text block, multiple templates, hidden Markov model, text information extraction

WANG Lei, CHEN Zhi-ping, LI Zhi-cheng. Using text blocks based on multiple templates hidden Markov model for text information extraction[J]. J4, 2006, 41(3): 19-24.
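
The approach outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the separator set (comma, semicolon, newline), the hypothetical `block_symbol` feature function, add-one smoothing, the toy training data, and the rule of choosing the template whose Viterbi path scores highest. What it does take from the abstract is the overall shape: text is segmented into blocks, emission probabilities are estimated from all training data together, and initial and transition probabilities are estimated separately for each template.

```python
import math
import re

def split_blocks(text):
    """Segment text into blocks at separators (assumed: comma, semicolon, newline)."""
    return [b.strip() for b in re.split(r"[,;\n]", text) if b.strip()]

def block_symbol(block):
    """Hypothetical coarse observation symbol for one block."""
    if re.fullmatch(r"\d{4}", block):
        return "YEAR"
    return "LONG" if len(block.split()) > 3 else "SHORT"

def _normalize(counts):
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def train(samples, states, symbols):
    """samples: list of (template_id, [(symbol, state), ...]) labeled sequences."""
    # Emission counts are pooled over every template (add-one smoothing, assumed).
    emit = {s: {o: 1.0 for o in symbols} for s in states}
    # Initial and transition counts are kept separately per template.
    init, trans = {}, {}
    for tpl, seq in samples:
        if tpl not in init:
            init[tpl] = {s: 1.0 for s in states}
            trans[tpl] = {p: {s: 1.0 for s in states} for p in states}
        init[tpl][seq[0][1]] += 1
        for (_, prev), (_, cur) in zip(seq, seq[1:]):
            trans[tpl][prev][cur] += 1
        for sym, state in seq:
            emit[state][sym] += 1
    return {
        "states": list(states),
        "emit": {s: _normalize(c) for s, c in emit.items()},
        "init": {t: _normalize(c) for t, c in init.items()},
        "trans": {t: {p: _normalize(c) for p, c in rows.items()}
                  for t, rows in trans.items()},
    }

def viterbi(obs, states, pi, a, b):
    """Return (log score, best state path) for one observation sequence."""
    score = {s: math.log(pi[s]) + math.log(b[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        prev_score, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev_score[p] + math.log(a[p][s]))
            score[s] = prev_score[best] + math.log(a[best][s]) + math.log(b[s][o])
            ptr[s] = best
        back.append(ptr)
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path

def extract(text, model):
    """Label each block, using the template whose best path scores highest (assumed rule)."""
    blocks = split_blocks(text)
    if not blocks:
        return None, []
    obs = [block_symbol(b) for b in blocks]
    best = None
    for tpl in model["init"]:
        score, path = viterbi(obs, model["states"], model["init"][tpl],
                              model["trans"][tpl], model["emit"])
        if best is None or score > best[0]:
            best = (score, tpl, path)
    return best[1], list(zip(blocks, best[2]))

# Toy labeled data for two hypothetical citation templates.
STATES = ["author", "title", "year"]
SYMBOLS = ["SHORT", "LONG", "YEAR"]
SAMPLES = [
    ("author_first", [("SHORT", "author"), ("LONG", "title"), ("YEAR", "year")]),
    ("author_first", [("SHORT", "author"), ("LONG", "title"), ("YEAR", "year")]),
    ("title_first", [("LONG", "title"), ("SHORT", "author"), ("YEAR", "year")]),
    ("title_first", [("LONG", "title"), ("SHORT", "author"), ("YEAR", "year")]),
]
model = train(SAMPLES, STATES, SYMBOLS)
```

Calling `extract("Wang L, Text extraction with hidden Markov models, 2006", model)` returns the chosen template identifier together with a list of `(block, state)` pairs. Pooling the emission counts lets every template benefit from all labeled blocks, while the per-template initial and transition tables keep differing field orders from blurring one another.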
