Journal of Shandong University (Natural Science)

J4 ›› 2012, Vol. 47 ›› Issue (5): 73-77.

• Electronic Technology and Information •

Research of Twitter data collection

FANG Wei-wei1,2, LI Jing-yuan1, LIU Yue1, YU Zhi-hua1, CAO Peng1,2, ZHANG Kai1

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
  2. Graduate University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2011-11-30 Online: 2012-05-20 Published: 2012-06-01
  • About the author: FANG Wei-wei (1989- ), male, master's student; current research interest: network security. Email: fangweiwei@software.ict.ac.cn
  • Funding:

    National Information Security Special Project (2010F032); National High Technology Research and Development Program of China ("863" Program) (2010AA012500); Key Program of the Natural Science Foundation (60933005)

Abstract:

To collect Twitter data efficiently and in real time, two user data collection methods, one based on the Twitter List API and one based on the Lookup API, are proposed after analyzing the shortcomings of traditional collection methods. By classifying users, the methods can precisely control the frequency of API calls. A series of experiments on more than 260,000 Twitter users and more than 6 million messages shows that combining the two methods collects Twitter user data efficiently and in real time.

Key words: Twitter; List API; Lookup API; data collection
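
The abstract states only that users are classified so that the API call frequency can be controlled precisely; it does not say how users are classified or how calls are scheduled. The following is a minimal sketch of one way such a scheme could look, not the authors' method. The tier names, the tweets-per-day thresholds, the weights, and the function names `classify` and `polling_intervals` are all hypothetical. The batch size of 100 reflects the limit of 100 users per request of Twitter's users/lookup endpoint, and the hourly call budget is left as a parameter.

```python
from dataclasses import dataclass

# Twitter's users/lookup endpoint accepted up to 100 users per request.
BATCH_SIZE = 100


@dataclass(frozen=True)
class Tier:
    name: str                  # hypothetical tier label
    min_tweets_per_day: float  # hypothetical lower bound for this tier
    weight: int                # hypothetical: sweeps per base cycle


# Ordered from most to least active; the thresholds are illustrative only.
TIERS = [
    Tier("active", 5.0, 4),
    Tier("normal", 0.5, 2),
    Tier("quiet", 0.0, 1),
]


def classify(tweets_per_day: float) -> Tier:
    """Return the first tier whose threshold the user's posting rate meets."""
    for tier in TIERS:
        if tweets_per_day >= tier.min_tweets_per_day:
            return tier
    return TIERS[-1]  # fallback for negative or otherwise invalid rates


def polling_intervals(user_rates: dict, calls_per_hour: int) -> dict:
    """Compute seconds between full sweeps of each tier.

    The intervals are chosen so that the total number of batched requests
    per hour equals the given budget, with more active tiers swept more often.
    """
    counts = {tier.name: 0 for tier in TIERS}
    for rate in user_rates.values():
        counts[classify(rate).name] += 1

    # Requests needed to sweep each tier once (ceiling division).
    batches = {name: -(-n // BATCH_SIZE) for name, n in counts.items()}

    # In one base cycle, a tier with weight w is swept w times.
    calls_per_cycle = sum(t.weight * batches[t.name] for t in TIERS)
    if calls_per_cycle == 0:
        return {}

    cycle_seconds = 3600 * calls_per_cycle / calls_per_hour
    return {
        t.name: cycle_seconds / t.weight
        for t in TIERS
        if batches[t.name] > 0
    }
```

With 200 active, 400 normal, and 1,000 quiet users and a budget of 260 calls per hour, this sketch sweeps the active tier every 90 seconds, the normal tier every 180 seconds, and the quiet tier every 360 seconds, which uses the budget exactly (80 + 80 + 100 requests per hour).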
