Journal of Shandong University (Natural Science) ›› 2017, Vol. 52 ›› Issue (7): 91-96. doi: 10.6040/j.issn.1671-9352.1.2016.019
CHEN Jing (陈敬), LI Shou-shan* (李寿山), ZHOU Guo-dong (周国栋)
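Abstract: Traditional age regression methods cannot learn deep-level information, so this work uses a deep learning method that can fully exploit contextual relations to identify users' ages. Specifically, an LSTM-based age regression method is proposed that can learn long-term dependencies, that is, establish long-range correlations among input values. Two kinds of features are used: textual features and social features. To distinguish these two kinds of features effectively and make full use of the information between them, an age regression method based on a dual-channel LSTM is further proposed. It is implemented by adding a Merge layer to the neural network, which combines the textual feature representation and the social feature representation produced by separate LSTMs for joint learning, so that the connections between textual and social features can be fully learned. Experimental results show that the dual-channel LSTM method effectively distinguishes textual features from social features and achieves better age regression performance than a single-LSTM method.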
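The dual-channel idea described in the abstract can be illustrated with a minimal, dependency-free forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the class names (`LSTMChannel`, `DualChannelLSTMRegressor`), the choice of concatenation as the merge operation, the treatment of each feature type as a sequence of vectors, the hidden size, and the random untrained weights are all hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these by training.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LSTMChannel:
    """One LSTM channel: encodes a sequence of vectors into its final hidden state."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [x_t; h_{t-1}].
        self.weights = {g: init_matrix(hidden_dim, input_dim + hidden_dim, rng)
                        for g in "ifoc"}
        self.biases = {g: [0.0] * hidden_dim for g in "ifoc"}

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for x in sequence:
            z = list(x) + h
            pre = {g: [a + b for a, b in zip(matvec(self.weights[g], z),
                                             self.biases[g])]
                   for g in "ifoc"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            # Cell update: keep part of the old memory, add part of the candidate.
            c = [f * c_old + i * g
                 for f, c_old, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(c_new) for o, c_new in zip(o_gate, c)]
        return h


class DualChannelLSTMRegressor:
    """Two separate LSTM channels whose outputs are merged before a linear output."""

    def __init__(self, text_dim, social_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.text_channel = LSTMChannel(text_dim, hidden_dim, rng)
        self.social_channel = LSTMChannel(social_dim, hidden_dim, rng)
        self.out_weights = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_dim)]
        self.out_bias = 0.0

    def merge(self, text_seq, social_seq):
        # Assumed merge operation: concatenate the two channel representations.
        return (self.text_channel.encode(text_seq)
                + self.social_channel.encode(social_seq))

    def predict(self, text_seq, social_seq):
        merged = self.merge(text_seq, social_seq)
        # Linear regression head producing a single real-valued output.
        return sum(w * m for w, m in zip(self.out_weights, merged)) + self.out_bias


# Toy inputs (made up): 2 time steps of 3-dim text features,
# 3 time steps of 2-dim social features.
model = DualChannelLSTMRegressor(text_dim=3, social_dim=2, hidden_dim=4)
text_seq = [[0.1, 0.0, 0.3], [0.2, 0.5, 0.0]]
social_seq = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
merged = model.merge(text_seq, social_seq)
age = model.predict(text_seq, social_seq)
print(len(merged), age)
```

Because the weights are untrained, the printed value is not a meaningful age. The point of the sketch is the structure: each feature type gets its own LSTM, and only the merged representation feeds the regression output.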