
Journal of Shandong University (Natural Science) ›› 2017, Vol. 52 ›› Issue (7): 91-96. doi: 10.6040/j.issn.1671-9352.1.2016.019


User age regression with dual-channel LSTM

CHEN Jing, LI Shou-shan*, ZHOU Guo-dong

  1. Natural Language Processing Lab, Soochow University, Suzhou 215006, Jiangsu, China
  • Received: 2016-12-09 Online: 2017-07-20 Published: 2017-07-07
  • Corresponding author: LI Shou-shan (b. 1980), male, Ph.D., professor; research interest: natural language processing. E-mail: shoushan.li@gmail.com
  • About the author: CHEN Jing (b. 1992), male, master's student; research interest: natural language processing. E-mail: jing.chen199225@gmail.com
  • Supported by:
    Key Program of the National Natural Science Foundation of China (61331011); National Natural Science Foundation of China (61375073, 61273320)

Abstract: Traditional age regression approaches cannot learn deep-level information, so we use a deep learning approach that can fully exploit contextual relations to predict users' ages. Specifically, we propose an age regression approach based on LSTM, which can learn long-term dependencies, that is, build long-range connections between input values. We use two different kinds of features, namely textual features and social features. To distinguish the two kinds of features effectively and make full use of the information between them, we further propose an age regression approach based on dual-channel LSTM. It is implemented by adding a Merge layer to the neural network: the textual feature representation and the social feature representation, each produced by its own LSTM, are combined and learned jointly so that the network can capture the connections between textual and social features. Experimental results show that the dual-channel LSTM approach effectively distinguishes textual and social features and achieves better age regression performance than a single LSTM.

Key words: age regression, LSTM, social features, textual features
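
The dual-channel idea in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the class names `LSTMChannel` and `DualChannelLSTM`, the dimensions, the random untrained weights, the use of the final hidden state as each channel's representation, and the choice of concatenation as the merge operation are all assumptions made for illustration. It uses only the Python standard library to stay self-contained; a real system would presumably be built and trained in a deep learning framework.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class LSTMChannel:
    """One LSTM channel: reads a sequence of vectors, returns the final hidden state."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix and bias per gate: input (i), forget (f),
        # output (o) and candidate (g). Each acts on [x_t, h_{t-1}].
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                   for _ in range(hidden_dim)]
            for gate in "ifog"
        }
        self.biases = {gate: [0.0] * hidden_dim for gate in "ifog"}

    def _gate(self, name, xh, activation):
        pre = matvec(self.weights[name], xh)
        return [activation(p + b) for p, b in zip(pre, self.biases[name])]

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state (the long-term memory)
        for x in sequence:
            xh = list(x) + h
            i = self._gate("i", xh, sigmoid)
            f = self._gate("f", xh, sigmoid)
            o = self._gate("o", xh, sigmoid)
            g = self._gate("g", xh, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


class DualChannelLSTM:
    """Two separate LSTM channels whose outputs are merged before regression."""

    def __init__(self, text_dim, social_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.text_channel = LSTMChannel(text_dim, hidden_dim, rng)
        self.social_channel = LSTMChannel(social_dim, hidden_dim, rng)
        # Linear regression head on top of the merged representation.
        self.out_weights = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_dim)]
        self.out_bias = 0.0

    def merge(self, text_seq, social_seq):
        # Merge step (assumed here to be concatenation of the two representations).
        return self.text_channel.encode(text_seq) + self.social_channel.encode(social_seq)

    def predict(self, text_seq, social_seq):
        merged = self.merge(text_seq, social_seq)
        return sum(w * m for w, m in zip(self.out_weights, merged)) + self.out_bias


# Hypothetical toy inputs: three textual feature vectors, two social feature vectors.
model = DualChannelLSTM(text_dim=3, social_dim=2, hidden_dim=4)
text_seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
social_seq = [[0.5, 0.2], [0.1, 0.9]]
merged = model.merge(text_seq, social_seq)
age = model.predict(text_seq, social_seq)
```

Because the weights are random and untrained, the value of `age` is meaningless here; the sketch only shows how the two feature types pass through separate channels and meet at the merge step, which is what lets the regression head weigh textual and social information jointly.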

CLC number: TP391