Journal of Shandong University (Natural Science), 2019, Vol. 54, Issue (3): 56-66. doi: 10.6040/j.issn.1671-9352.1.2018.100

Tag recommendation with multi-source heterogeneous networked information

Heng-ze BAO, Dong ZHOU*, Tan WU

  1. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, Hunan, China
  • Received: 2018-10-17 Online: 2019-03-01 Published: 2019-03-19
Tags are widely used to annotate online resources such as articles, images, and movies, helping users understand, manage, and index large volumes of web content. Because creating tags manually is time-consuming and error-prone, automatic tag recommendation techniques have attracted widespread attention. Most current tag recommendation methods rely mainly on mining the content of resources. However, most real-world data do not exist in isolation. For example, scientific articles form a complex network by citing one another. Research shows that the topology information and the text content of resources describe similar semantic features of those resources from two different perspectives, and that the two kinds of information can complement and explain each other. Based on this, we propose a probabilistic topic model and a tag recommendation method that simultaneously model the content information and the topological structure information of resources. The method uses multi-source heterogeneous information, such as the tagging relationships between tags and resource content and the link relationships between resources, to mine the latent semantics of resources and recommend several tags with similar functional semantics for new resources. Experimental results on two real data sets demonstrate the effectiveness of the proposed method.

Key words: tag, tag recommendation, topic model, heterogeneous network

An article on the CiteULike website


Search results for the keywords "mesophase" and "petroleum"

Table 1

Common symbols and their meanings

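Symbol Meaning
C Article C
C′ Article C′
W(c) Text content of article C
W(c′) Text content of article C′
θ(c) Topic distribution vector of W(c)
θ(c′) Topic distribution vector of W(c′)
T Number of topics in the model
W Number of words in the corpus vocabulary
Wt Number of tags in the corpus
ϕ Vector of dimension W giving the word distribution under a topic
ϕt Vector of dimension Wt giving the tag distribution under a topic
N The N articles most similar to the candidate article, selected by inter-article similarity
M The M highest-scoring tags selected by the tag filtering algorithm
S Unsorted set of candidate tags to recommend
S Sorted set of candidate tags to recommend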
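
The roles of N and M in Table 1 can be illustrated with a small Python sketch: pick the N tagged articles whose topic distributions are closest to that of a new article, pool their tags, and keep the M highest-scoring ones. This is only an illustration under stated assumptions, not the method from the paper. The abstract does not specify the similarity measure or the tag scoring rule, so cosine similarity and similarity-weighted voting are assumed here, and the names `cosine` and `recommend_tags` are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_tags(theta_new, corpus, n_neighbors, m_tags):
    """Recommend up to m_tags tags for a new article.

    theta_new: topic distribution vector of the new article.
    corpus: list of (theta, tags) pairs for already-tagged articles.
    Assumption: cosine similarity and similarity-weighted voting stand in
    for the paper's (unspecified here) similarity and filtering steps.
    """
    # Rank tagged articles by topic similarity to the new article.
    scored = sorted(
        ((cosine(theta_new, theta), tags) for theta, tags in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Pool the tags of the N nearest articles, weighting each vote by similarity.
    tag_scores = defaultdict(float)
    for sim, tags in scored[:n_neighbors]:
        for tag in tags:
            tag_scores[tag] += sim
    # Sort by descending score (ties broken alphabetically) and keep the top M.
    ranked = sorted(tag_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:m_tags]]
```

In this sketch the pooled dictionary plays the part of the unsorted candidate set and the returned list that of the sorted one. In practice the topic vectors would come from a trained topic model rather than being written by hand.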


TRTM model


The execution framework of the tag recommendation approach

Table 2

Detailed data information

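Dataset Item Count
citeulike-a Articles 16980
citeulike-a Tags 19107
citeulike-a Tags remaining after removing those used fewer than 5 times 7450
citeulike-a Citation links 294072
citeulike-t Articles 25975
citeulike-t Tags 52946
citeulike-t Tags remaining after removing those used fewer than 5 times 8311
citeulike-t Citation links 180103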


Recall at the top 50 tags as the number of topics (T) and the number of iterations (Iter) vary


Experimental results on dataset citeulike-a


Experimental results on dataset citeulike-t


Experimental results on dataset citeulike-a


Experimental results on dataset citeulike-t

Table 3

An example of tag recommendation for an article

Model: RTM
Articles with the most similar topics:
“an efficient algorithm to rank web resources”
“rank algorithm web graph link”
“searching social networks”
“thermal barrier coatings for gas-turbine engine applications”
Recommended tags: pager, resource, scale, web, rank, search, citation, engine, hyperlink, application
Correct tags: web, rank, engine

Model: TRTM
Articles with the most similar topics:
“the anatomy of a large scale hyper textual web search engine”
“an overview of audio information retrieval”
“pager citation ranking bringing order to web”
“searching social networks”
Recommended tags: engine, retrieval, link, hyperlink, search, citation, rank, web, relevance, citation
Correct tags: engine, link, search, rank, web
