J4 ›› 2013, Vol. 48 ›› Issue (11): 53-58.


Chinese spam microblog filtering based on the fusion of multi-angle features

YU Ran1,2, LIU Chun-yang3*, JIN Xiao-long1, WANG Yuan-zhuo1, CHENG Xue-qi1

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
  2. University of Chinese Academy of Sciences, Beijing 100190, China;
  3. National Computer Network Emergency Response Technical Team Coordination Center of China, Beijing 100029, China
Received: 2013-09-02 Online: 2013-11-20 Published: 2013-11-25

Abstract:

Because microblogs contain valuable information, data analysis on microblogs, such as topic detection, has become a research hotspot. Owing to the high flexibility of microblog content and form, noisy data is a major challenge for microblog analysis. However, no effective method has yet been developed for nonpublic-topic Chinese spam microblog filtering. To fill this gap, a new method is proposed that fuses multi-angle features extracted from both the content and the structure of microblogs. The fused features are then fed to classifiers to filter spam microblogs. Experiments on real data demonstrate that fusing multi-angle features effectively improves spam filtering performance.

Key words: spam microblog filtering; feature selection; multi-angle features fusion

CLC Number: TP391
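
The abstract describes the pipeline only at a high level: extract features from several angles (content and structure), fuse them, and pass the fused vector to a classifier. The sketch below illustrates that general idea under stated assumptions. The specific features (URL, topic-tag, and exclamation counts for content; mention count and a repost marker for structure), the concatenation-based fusion, the nearest-centroid classifier, and all function names are hypothetical choices made for illustration. They are not taken from the paper, which does not list its features or classifiers on this page.

```python
import re
from math import sqrt

def content_features(text):
    """Hypothetical content-angle features (not from the paper)."""
    return [
        float(len(re.findall(r"https?://\S+", text))),  # number of URLs
        float(text.count("#") // 2),                    # topics written as #topic#
        float(text.count("!")),                         # exclamation marks
    ]

def structure_features(text):
    """Hypothetical structure-angle features (not from the paper)."""
    return [
        float(len(re.findall(r"@\S+", text))),  # number of @mentions
        1.0 if "//@" in text else 0.0,          # repost-chain marker present
    ]

def fuse(text):
    """Fuse the two angles by simple concatenation (an assumed fusion scheme)."""
    return content_features(text) + structure_features(text)

def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train(samples):
    """Build one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(fuse(text))
    return {label: centroid(vectors) for label, vectors in by_label.items()}

def predict(model, text):
    """Return the label whose centroid is nearest to the fused vector."""
    vector = fuse(text)
    return min(model, key=lambda label: distance(model[label], vector))

# Toy, made-up examples used only to exercise the code.
TOY_SAMPLES = [
    ("Win a free phone now! http://a.example/x http://b.example/y @user1 @user2 @user3", "spam"),
    ("Cheap deals!! http://c.example/z @u4 @u5 @u6 @u7", "spam"),
    ("Had a nice lunch with friends today", "ham"),
    ("Reading a good book #reading# this weekend", "ham"),
]
model = train(TOY_SAMPLES)
```

Calling `predict(model, some_text)` returns either `"spam"` or `"ham"` for the toy model. Concatenation is the simplest way to combine feature groups; a real system would likely scale the features and use a stronger classifier, and the paper's actual choices may differ.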