JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE), 2017, Vol. 52, Issue (6): 24-31. doi: 10.6040/j.issn.1671-9352.5.2016.023


Model selection with likelihood ratio for sequence transfer learning

SUN Shi-chang1,2, LIN Hong-fei1, MENG Jia-na2*, LIU Hong-bo3   

  1. School of Computer, Dalian University of Technology, Dalian 116023, Liaoning, China;
    2. School of Computer, Dalian Minzu University, Dalian 116600, Liaoning, China;
    3. Information Science and Technology College, Dalian Maritime University, Dalian 116026, Liaoning, China
Received: 2016-10-11; Online: 2017-06-20; Published: 2017-06-21

Abstract: To address the under-adaptation problem in transfer learning, this paper takes the granular model as a set of candidate models and introduces the labeling rules contained in the minor models of the target domain through model selection. We propose a Likelihood Ratio based Model Selection method (LRMS) for inference over the granular model, which fuses the minor models with the granular model. LRMS preserves the single-path computation of Viterbi-based sequence labeling models and thereby avoids breaking contextual connections. In empirical experiments on part-of-speech tagging, LRMS improves accuracy on every transfer learning task, which verifies its effectiveness in solving the under-adaptation problem.
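The paper's implementation is not reproduced on this page; the Python sketch below only illustrates the general idea under stated assumptions: two toy first-order HMM taggers stand in for a source-domain base model and a target-domain minor model, Viterbi decoding provides single-path inference, and a log likelihood ratio between the two models' best paths selects which model labels a given sentence. All names here (TinyHMM, lrms_tag, threshold) are hypothetical and not taken from the paper.

import math

# Illustrative sketch (not the authors' code): a source-domain "base" HMM
# and a target-domain "minor" HMM compete per sentence; the log likelihood
# ratio of their best Viterbi paths decides whose tag sequence is used.

class TinyHMM:
    """A toy first-order HMM tagger with probability tables."""

    def __init__(self, states, start, trans, emit, floor=1e-6):
        self.states = states    # list of tags
        self.start = start      # {tag: P(tag | sentence start)}
        self.trans = trans      # {prev_tag: {tag: P(tag | prev_tag)}}
        self.emit = emit        # {tag: {word: P(word | tag)}}
        self.floor = floor      # crude smoothing floor for unseen events

    def _lp(self, table, key):
        # Log probability with the smoothing floor for unseen events.
        return math.log(table.get(key, self.floor))

    def viterbi(self, words):
        """Single-path Viterbi decoding; returns (log score, tag path)."""
        col = {t: (self._lp(self.start, t) + self._lp(self.emit[t], words[0]), [t])
               for t in self.states}
        for w in words[1:]:
            nxt = {}
            for t in self.states:
                prev = max(self.states,
                           key=lambda p: col[p][0] + self._lp(self.trans[p], t))
                score, path = col[prev]
                nxt[t] = (score + self._lp(self.trans[prev], t)
                          + self._lp(self.emit[t], w), path + [t])
            col = nxt
        best = max(self.states, key=lambda t: col[t][0])
        return col[best]

def lrms_tag(base, minor, words, threshold=0.0):
    """Select one model per sentence by log likelihood ratio.

    The best-path log likelihoods stand in for the sentence likelihoods;
    the winning model supplies the entire tag sequence, so transition
    probabilities from different models are never mixed mid-sentence."""
    ll_base, tags_base = base.viterbi(words)
    ll_minor, tags_minor = minor.viterbi(words)
    return tags_minor if ll_minor - ll_base > threshold else tags_base

Because the ratio test commits to a single model for the whole path, decoding stays single-path in the sense the abstract describes; a per-token mixture would instead combine transitions from different models and break the contextual connections between adjacent tags.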

Key words: transfer learning, likelihood ratio, part-of-speech tagging, model selection

CLC Number: TP391