Personalized recommendation of mobile users by integrating basic information and communication behavior

Xianjun WU1,2, Shaoshi TANG2, Mingqiu WANG2,*

  1. School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, Hubei, China
  2. School of Statistics and Data Science, Qufu Normal University, Jining 273165, Shandong, China
  Received: 2021-12-24 Online: 2023-09-20 Published: 2023-09-08
Considering the factors that affect mobile users' product ordering, this paper proposes a new matrix factorization model that integrates users' basic information and communication behavior, and compares it with the traditional model on rating prediction and Top-N recommendation. For rating prediction, several indices of the RFM model are adjusted to reflect the characteristics of mobile users' product ordering. The user-product rating matrix constructed from the adjusted RFM model reflects users' preferences for products accurately and objectively. For Top-N recommendation, a negative sampling method is adopted in which popular products that a user has not ordered are preferentially chosen as negative samples. The numerical results show that the proposed method performs better than the traditional model.

Key words: mobile users, matrix factorization model, RFM model, rating prediction, Top-N recommendation

CLC Number: TP391

Table 1

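Descriptions of target products

| Product type | Product No. | Product name | Example of product content |
| --- | --- | --- | --- |
| Data products | 1 | Data expansion | 20 yuan per month for an extra 2 GB of domestic data per month |
| Data products | 2 | Targeted data | 9 yuan for 15 GB of app-specific data (choose one of several popular apps such as Douyin, iQIYI, or NetEase Cloud) |
| Data products | 3 | Quick-use pack | 10 yuan for 5 GB of domestic data, valid for 7 d |
| Data products | 4 | Holiday pack | Summer offer: 30 yuan for 30 GB of data |
| Data products | 5 | Module upgrade | Upgrade to the 50-yuan data module and receive a discount or a shared-data product |
| Voice products | 6 | Voice expansion | 25 yuan per month for an extra 100 min of domestic voice per month |
| Voice products | 7 | Voice doubling pack | Voice doubling pack: 10 yuan for 80 min |
| Benefit products | 8 | Dual-V membership | 30 yuan for a pick-as-you-like membership + Migu Video gold membership + 30 GB of targeted data |
| Benefit products | 9 | Car-owner membership | Car-owner membership at 10 yuan/month, with free car washes and fuel discounts |
| Benefit products | 10 | Premium membership | Car-owner membership at 3 yuan/month, with discounts on Quanyihui products |
| Benefit products | 11 | Mobile live-streaming pack | Migu Video gold membership + 15 GB of Migu Video data |
| Benefit products | 12 | Food and lifestyle gift pack | 15 yuan for a choice of Meituan or Ele.me membership + 1 GB of domestic general-purpose data |
| Benefit products | 13 | Long-term speed-up pack | Customers on plans that are throttled after the data cap can order this pack to lift the speed limit |
| Benefit products | 14 | Video ringback tone | Plays entertaining video media to the caller in place of the ordinary network ringback tone |
| Benefit products | 15 | On-demand annual pack | Mobaihe movie + TV series combined annual pack, 199 yuan |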

Table 2

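Descriptions of users' indices

| Data type | Field name | Description | Source table |
| --- | --- | --- | --- |
| Basic user attributes | nat_age_year | User age | dw_product_ageinfo_202012 |
| Basic user attributes | sex_id | User gender | dw_product_ageinfo_202012 |
| Basic user attributes | user_online | Time on network | dw_cust_label_monthone_202012 |
| User communication behavior | arpu_avg | Mean of actual spending | dw_newbusi_product_gprs |
| User communication behavior | arpu_std | Variance of actual spending | dw_newbusi_product_gprs |
| User communication behavior | dou_avg | Mean of data usage | dw_newbusi_product_gprs |
| User communication behavior | dou_std | Variance of data usage | dw_newbusi_product_gprs |
| User communication behavior | bhd_avg | Mean of saturation | dw_product_bhd_user_mm |
| User communication behavior | bhd_std | Variance of saturation | dw_product_bhd_user_mm |
| User communication behavior | dou_overfee_avg | Mean of data overage fees | dw_acct_newbusi_mm |
| User communication behavior | dou_overfee_std | Variance of data overage fees | dw_acct_newbusi_mm |
| User communication behavior | dou_overfee_cn | Number of data overages | dw_acct_newbusi_mm |
| User communication behavior | mou_avg | Mean of voice usage | dw_newbusi_call_overpkg |
| User communication behavior | mou_std | Variance of voice usage | dw_newbusi_call_overpkg |
| User communication behavior | mou_overfee_avg | Mean of voice overage fees | dw_newbusi_call_overpkg |
| User communication behavior | mou_overfee_std | Variance of voice overage fees | dw_newbusi_call_overpkg |
| User communication behavior | mou_overfee_cn | Number of voice overages | dw_newbusi_call_overpkg |
| User ordering behavior | recency | Most recent order time | dw_product_priv_accept_ds |
| User ordering behavior | time | Product usage duration | dw_product_priv_accept_ds |
| User ordering behavior | frequency | Service ordering frequency | dw_product_priv_accept_ds |
| User ordering behavior | monetary | Order amount | dw_product_priv_accept_ds |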

Table 3

Significance of indices (p)

| User index | Product 1 | Product 2 | Product 3 | Product 4 | Product 5 |
| --- | --- | --- | --- | --- | --- |
| nat_age_year | 0.0008*** | 5.94e-05*** | 0.220 | < 2e-16*** | 0.265 |
| sex_id | 0.001** | 2.54e-12*** | < 2e-16*** | 7.68e-06*** | < 2e-16*** |
| user_online | 0.548 | 1.72e-07*** | 0.770 | 9.61e-05*** | 0.103 |
| arpu_avg | 2.05e-15*** | 0.010* | 0.0003*** | 0.192 | 0.237 |
| arpu_std | 0.064Δ | 0.029* | 2.68e-06*** | 0.003** | 0.014* |
| dou_avg | 0.002** | 3.49e-06*** | 0.986 | 0.212 | 0.319 |
| dou_std | 0.649 | 0.008** | 0.438 | 0.833 | 0.239 |
| bhd_avg | 0.0007*** | 0.516 | 0.162 | 0.397 | 0.239 |
| bhd_std | 0.0007*** | 0.570 | 0.141 | 0.330 | 0.213 |
| overgprs_fee_avg | 0.0002*** | 0.093Δ | 0.378 | 0.307 | 0.114 |
| overgprs_fee_std | 0.028* | 0.659 | 2.45e-05*** | 0.193 | 0.0006*** |
| overgprs_flag | 0.004** | 0.457 | < 2e-16*** | 0.351 | 6.5e-05*** |
| mou_avg | 0.544 | 0.831 | 0.0002*** | 0.067Δ | 0.0511Δ |
| mou_std | 0.726 | 0.815 | 0.009** | 0.742 | 0.010* |
| overcall_fee_avg | 0.367 | 0.982 | 0.0007*** | 0.139 | 0.024* |
| overcall_fee_std | 0.018* | 0.144 | 0.0005*** | 0.195 | 0.002** |
| overcall_flag | 2.53e-05*** | 0.0005*** | 0.076Δ | 0.352 | 0.004** |

| User index | Product 6 | Product 7 | Product 8 | Product 9 | Product 10 |
| --- | --- | --- | --- | --- | --- |
| nat_age_year | 0.005** | 1.09e-07*** | 0.021* | 1.15e-07*** | 0.001** |
| sex_id | 0.0002*** | 5.61e-10*** | < 2e-16*** | 1.61e-12*** | 5.88e-09*** |
| user_online | 0.494 | 0.867 | 0.0004*** | 0.011* | 0.003** |
| arpu_avg | 4.66e-14*** | 0.301 | 0.056Δ | 0.523 | 1.12e-09*** |
| arpu_std | 0.731 | 0.010* | 6.94e-05*** | 0.166 | 0.513 |
| dou_avg | 0.010** | 0.0006*** | 0.922 | 0.001** | 0.496 |
| dou_std | 0.045* | 0.773 | 0.647 | 0.002** | 0.086Δ |
| bhd_avg | 0.421 | 0.453 | 0.900 | 0.595 | 0.204 |
| bhd_std | 0.410 | 0.780 | 0.864 | 0.664 | 0.191 |
| overgprs_fee_avg | 0.219 | 0.883 | 0.313 | 0.110 | 0.002** |
| overgprs_fee_std | 0.327 | 0.239 | 0.0008*** | 0.049* | 0.633 |
| overgprs_flag | 0.094Δ | 0.282 | 0.926 | 6.95e-10*** | 0.026* |
| mou_avg | 9.12e-11*** | 0.853 | 0.854 | 0.009** | 0.002** |
| mou_std | 0.844 | 0.010** | 0.705 | 0.077Δ | 0.188 |
| overcall_fee_avg | 0.271 | 0.180 | 0.088Δ | 0.084Δ | 0.323 |
| overcall_fee_std | 0.011* | 0.010** | 0.443 | 0.232 | 0.060Δ |
| overcall_flag | 0.035* | 0.002** | 0.235 | 0.035* | 0.005** |

| User index | Product 11 | Product 12 | Product 13 | Product 14 | Product 15 |
| --- | --- | --- | --- | --- | --- |
| nat_age_year | 0.125 | 7.56e-13*** | < 2e-16*** | 2.04e-06*** | 3.94e-06*** |
| sex_id | 0.011* | 6.85e-08*** | < 2e-16*** | 1.40e-06*** | < 2e-16*** |
| user_online | 0.425 | 9.23e-07*** | 0.107 | 0.213 | 0.279 |
| arpu_avg | 0.368 | 0.034* | 3.50e-06*** | 0.042* | 4.35e-12*** |
| arpu_std | 0.229 | 0.010** | 7.19e-06*** | 0.324 | 0.227 |
| dou_avg | 0.073Δ | 0.210 | 8.77e-06*** | 0.035* | 1.20e-10*** |
| dou_std | 0.513 | 0.013* | 0.080Δ | 0.079Δ | 0.002** |
| bhd_avg | 0.619 | 0.599 | 0.018* | 0.606 | 0.376 |
| bhd_std | 0.728 | 0.628 | 0.039* | 0.619 | 0.211 |
| overgprs_fee_avg | 0.896 | 0.135 | 0.130 | 0.980 | 0.169 |
| overgprs_fee_std | 0.484 | 0.009** | 0.047* | 0.626 | 0.714 |
| overgprs_flag | 0.546 | 0.363 | < 2e-16*** | 0.0001*** | 0.002** |
| mou_avg | 0.804 | 0.076Δ | 0.883 | 0.671 | 0.891 |
| mou_std | 0.973 | 0.266 | 0.974 | 0.429 | 0.581 |
| overcall_fee_avg | 0.937 | 0.155 | 0.084Δ | 0.883 | 0.021* |
| overcall_fee_std | 0.425 | 0.103 | 0.264 | 0.449 | 0.089Δ |
| overcall_flag | 0.234 | 0.195 | 0.905 | 0.001** | 2.19e-06*** |


Scatter plot of similarity

Table 4

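Calculation of R

| User No. | Product No. | Time of last purchase $t_{ui}$ | Set end date $T$ | R |
| --- | --- | --- | --- | --- |
| 1 | 1 | $t_{11}$ | 2020-12-31 | $T - t_{11}$ |
| 1 | 2 | $t_{12}$ | 2020-12-31 | $T - t_{12}$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| m | n-1 | $t_{m(n-1)}$ | 2020-12-31 | $T - t_{m(n-1)}$ |
| m | n | $t_{mn}$ | 2020-12-31 | $T - t_{mn}$ |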

Table 5

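Calculation of F

| User No. | Product No. | Number of orders $c_{ui}$ | Total product usage duration $t_{ui}$ | Product billing cycle $T_i$ | F |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | $c_{11}$ | $t_{11}$ | $T_1$ | $\frac{t_{11}}{T_1}$ |
| 1 | 2 | $c_{12}$ | $t_{12}$ | $T_2$ | $\frac{t_{12}}{T_2}$ |
| 1 | 3 | $c_{13}$ | - | - | $c_{13}$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| m | n-1 | $c_{m(n-1)}$ | $t_{m(n-1)}$ | $T_{n-1}$ | $\left[\frac{t_{m(n-1)}}{T_{n-1}}\right]$ |
| m | n | $c_{mn}$ | $t_{mn}$ | $T_n$ | $\left[\frac{t_{mn}}{T_n}\right]$ |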

Table 6

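Calculation of M

| User No. | Product No. | Total amount ordered for this product $a_{ui}$ | Total amount ordered across all products $A_u$ | M |
| --- | --- | --- | --- | --- |
| 1 | 1 | $a_{11}$ | $A_1$ | $\frac{a_{11}}{A_1}$ |
| 1 | 2 | $a_{12}$ | $A_1$ | $\frac{a_{12}}{A_1}$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| m | n-1 | $a_{m(n-1)}$ | $A_m$ | $\frac{a_{m(n-1)}}{A_m}$ |
| m | n | $a_{mn}$ | $A_m$ | $\frac{a_{mn}}{A_m}$ |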
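
The three index definitions in Tables 4 to 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, recency is measured in days, and the square brackets that appear in some rows of Table 5 (possibly a rounding operation) are not modeled.

```python
from datetime import date

END_DATE = date(2020, 12, 31)  # the set end date T used in Table 4


def recency(last_purchase: date, end_date: date = END_DATE) -> int:
    """R = T - t_ui, measured here in days (the unit is an assumption)."""
    return (end_date - last_purchase).days


def frequency(order_count, usage_duration=None, billing_cycle=None):
    """F = t_ui / T_i for products with a billing cycle, else the order count c_ui."""
    if usage_duration is None or billing_cycle is None:
        return float(order_count)
    return usage_duration / billing_cycle


def monetary(product_amount, total_amount):
    """M = a_ui / A_u, the share of the user's total spending that went to this product."""
    return product_amount / total_amount if total_amount else 0.0


if __name__ == "__main__":
    # Hypothetical record: last order on 2020-12-01, 90 days of use on a 30-day cycle,
    # 20 yuan spent on this product out of 80 yuan in total.
    print(recency(date(2020, 12, 1)), frequency(2, 90, 30), monetary(20, 80))
```

Using the usage duration divided by the billing cycle, rather than the raw order count, lets a long-running subscription count as many effective orders, which is the adjustment Table 5 describes for products that have a billing cycle.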

Table 7

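Weights of R, F and M

| Index | R | F | M |
| --- | --- | --- | --- |
| Information entropy | 0.759 | 0.807 | 0.804 |
| Information entropy redundancy | 0.241 | 0.193 | 0.196 |
| Weight | 0.383 | 0.306 | 0.310 |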
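
The relationship between the three rows of Table 7 follows the usual entropy weight method: redundancy is one minus the entropy, and each weight is that index's share of the total redundancy. The sketch below is a minimal illustration with hypothetical function names. The `column_entropy` helper uses the standard normalized Shannon entropy, which is an assumption, since the exact preprocessing of the R, F and M columns is not shown here.

```python
import math


def column_entropy(values):
    """Normalized Shannon entropy of one non-negative indicator column, in [0, 1]."""
    n = len(values)
    total = sum(values)
    if n < 2 or total == 0:
        return 1.0  # a degenerate column carries no information
    h = 0.0
    for v in values:
        p = v / total
        if p > 0:
            h -= p * math.log(p)
    return h / math.log(n)


def entropy_weights(entropies):
    """Weights proportional to the redundancy 1 - e_j of each index."""
    redundancy = [1.0 - e for e in entropies]
    total = sum(redundancy)
    return [d / total for d in redundancy]


if __name__ == "__main__":
    # Entropy values for R, F and M as listed in Table 7.
    print([round(w, 3) for w in entropy_weights([0.759, 0.807, 0.804])])
```

Applied to the rounded entropies in Table 7, this gives weights of about 0.383, 0.306 and 0.311. The last digit for M differs from the tabulated 0.310, presumably because the table was computed from unrounded entropies.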

Table 8

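Calculation of user-product rating matrix

| User No. | Product No. | R | F | M | Score |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | $R_{11}$ | $F_{11}$ | $M_{11}$ | $R_{\text{weight}} R_{11} + F_{\text{weight}} F_{11} + M_{\text{weight}} M_{11}$ |
| 1 | 2 | $R_{12}$ | $F_{12}$ | $M_{12}$ | $R_{\text{weight}} R_{12} + F_{\text{weight}} F_{12} + M_{\text{weight}} M_{12}$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| m | n-1 | $R_{m(n-1)}$ | $F_{m(n-1)}$ | $M_{m(n-1)}$ | $R_{\text{weight}} R_{m(n-1)} + F_{\text{weight}} F_{m(n-1)} + M_{\text{weight}} M_{m(n-1)}$ |
| m | n | $R_{mn}$ | $F_{mn}$ | $M_{mn}$ | $R_{\text{weight}} R_{mn} + F_{\text{weight}} F_{mn} + M_{\text{weight}} M_{mn}$ |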
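
The weighted sum in Table 8 can be sketched as follows, using the weights from Table 7. The sketch assumes that R, F and M have already been rescaled to comparable ranges and oriented so that larger values mean stronger preference. That scaling step is not shown here, and the records in the example are hypothetical.

```python
WEIGHTS = {"R": 0.383, "F": 0.306, "M": 0.310}  # weights from Table 7


def rfm_score(r, f, m, weights=WEIGHTS):
    """Score = R_weight * R + F_weight * F + M_weight * M, as in Table 8."""
    return weights["R"] * r + weights["F"] * f + weights["M"] * m


def build_rating_matrix(records, n_users, n_products):
    """Fill a dense user-product matrix; pairs without an order keep the value 0."""
    matrix = [[0.0] * n_products for _ in range(n_users)]
    for user, product, r, f, m in records:
        matrix[user - 1][product - 1] = rfm_score(r, f, m)  # numbering starts at 1
    return matrix


if __name__ == "__main__":
    # Hypothetical (user, product, R, F, M) records with pre-scaled indices.
    records = [(1, 3, 5, 3, 2), (2, 1, 4, 4, 4)]
    for row in build_rating_matrix(records, n_users=2, n_products=3):
        print([round(x, 3) for x in row])
```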

Table 9

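Partial data of user-product rating matrix

| User No. | On-demand annual pack | Food and lifestyle gift pack | Quick-use pack | Module upgrade |
| --- | --- | --- | --- | --- |
| 1 | 0 | 0 | 2.827 | 0 |
| 2 | 0 | 0 | 5.354 | 0 |
| 3 | 0 | 0 | 0 | 4.532 |
| 4 | 0 | 0 | 1.920 | 2.975 |
| 5 | 0 | 0 | 6.105 | 0 |
| 6 | 3.138 | 0 | 0 | 0 |
| 7 | 2.058 | 0 | 2.232 | 0 |
| 8 | 0 | 0 | 2.721 | 0 |
| 9 | 0 | 3.279 | 5.668 | 0 |
| 10 | 0 | 3.465 | 0 | 2.789 |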

Table 10

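Prediction results of rating

| K | Traditional matrix factorization model | | Improved matrix factorization model | |
| --- | --- | --- | --- | --- |
| 2 | 0.350 | 0.255 | 0.182 | 0.141 |
| 4 | 0.252 | 0.199 | 0.144 | 0.121 |
| 6 | 0.196 | 0.166 | 0.124 | 0.114 |
| 8 | 0.152 | 0.136 | 0.116 | 0.111 |
| 10 | 0.147 | 0.135 | 0.113 | 0.111 |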
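
For orientation, a plain latent-factor model of the kind usually meant by a traditional matrix factorization baseline can be sketched as below, with K latent factors trained by stochastic gradient descent on the observed ratings. This is an assumption-laden illustration, not the paper's implementation: the hyperparameters are hypothetical, zeros in the rating matrix are treated as unobserved, and the improved model that integrates user information is not reproduced. The sample ratings are the non-zero entries of Table 9, with zero-based user and product indices.

```python
import math
import random

# (user index, product index, rating): the non-zero entries of Table 9.
TABLE9_RATINGS = [
    (0, 2, 2.827), (1, 2, 5.354), (2, 3, 4.532), (3, 2, 1.920), (3, 3, 2.975),
    (4, 2, 6.105), (5, 0, 3.138), (6, 0, 2.058), (6, 2, 2.232), (7, 2, 2.721),
    (8, 1, 3.279), (8, 2, 5.668), (9, 1, 3.465), (9, 3, 2.789),
]


def predict(p, q, u, i):
    """Predicted rating: inner product of the user and product factor vectors."""
    return sum(a * b for a, b in zip(p[u], q[i]))


def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user factors p and product factors q by SGD with L2 regularization."""
    rng = random.Random(seed)
    p = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, r in samples:
            err = r - predict(p, q, u, i)
            for f in range(k):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q


def rmse(ratings, p, q):
    """Root mean squared error over the given (user, product, rating) triples."""
    sq = [(r - predict(p, q, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    p, q = train_mf(TABLE9_RATINGS, n_users=10, n_items=4, k=2)
    print("training RMSE:", round(rmse(TABLE9_RATINGS, p, q), 3))
```

The error printed here is a training error on fourteen ratings and is not comparable to the values in Table 10.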


Prediction results of rating


Histogram of the number of product ordering users

Table 11

Negative sampling results of users

| User No. | Positive product samples | ratio=1 | ratio=2 |
| --- | --- | --- | --- |
| 1 | {3, 5} | {8, 2} | {8, 2, 4, 10} |
| 2 | {2, 10} | {5, 3} | {5, 3, 8, 4} |
| 3 | {3, 8, 10} | {5, 2, 4} | {5, 2, 4, 14, 15, 9} |
| 4 | {3, 5} | {8, 2} | {8, 2, 4, 10} |
| 5 | {3, 8} | {5, 2} | {5, 2, 4, 10} |
| 6 | {4, 8, 10} | {5, 3} | {5, 3, 2, 14, 15, 9} |
| 7 | {5, 6, 8} | {3, 2, 4} | {3, 2, 4, 10, 14, 15} |
| 8 | {4, 5} | {3, 8} | {3, 8, 2, 10} |
| 9 | {3, 4} | {5, 8} | {5, 8, 2, 10} |
| 10 | {2, 5} | {3, 8} | {3, 8, 4, 10} |
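
A popularity-first negative sampler consistent with the description in the abstract can be sketched as follows. For each user it takes `ratio` times as many negatives as positives, choosing the most popular products the user has not ordered. The deterministic selection and the popularity order in `POPULARITY_RANK` are assumptions made for illustration; a weighted random draw would be another reading of "preferred".

```python
def sample_negatives(positives, popularity_rank, ratio=1):
    """Return ratio * len(positives) negatives, most popular unordered products first."""
    wanted = ratio * len(positives)
    candidates = [item for item in popularity_rank if item not in positives]
    return candidates[:wanted]


# Hypothetical popularity order of the 15 products, most popular first.
POPULARITY_RANK = [5, 3, 8, 2, 4, 10, 14, 15, 9, 6, 1, 7, 11, 12, 13]

if __name__ == "__main__":
    positives = {3, 5}  # positive samples of user 1 in Table 11
    print(sample_negatives(positives, POPULARITY_RANK, ratio=1))
    print(sample_negatives(positives, POPULARITY_RANK, ratio=2))
```

Favoring popular but unordered products as negatives rests on the idea that a user has very likely seen a popular product, so not ordering it is stronger evidence of disinterest than not ordering an obscure one.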

Table 12

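Performance of Top-N recommendation

| Model | N | ratio | Precision | Recall | F1 | Coverage | Popularity |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Classical matrix factorization model | 1 | 1 | 0.031 | 0.043 | 0.036 | 0.400 | 5.167 |
| Classical matrix factorization model | 1 | 2 | 0.049 | 0.031 | 0.038 | 0.267 | 5.602 |
| Classical matrix factorization model | 1 | 3 | 0.098 | 0.083 | 0.090 | 0.133 | 5.945 |
| Classical matrix factorization model | 2 | 1 | 0.057 | 0.069 | 0.062 | 0.467 | 5.091 |
| Classical matrix factorization model | 2 | 2 | 0.115 | 0.123 | 0.119 | 0.267 | 5.761 |
| Classical matrix factorization model | 2 | 3 | 0.206 | 0.216 | 0.211 | 0.133 | 6.447 |
| Improved matrix factorization model | 1 | 1 | 0.048 | 0.068 | 0.056 | 0.333 | 5.236 |
| Improved matrix factorization model | 1 | 2 | 0.060 | 0.045 | 0.051 | 0.200 | 5.639 |
| Improved matrix factorization model | 1 | 3 | 0.127 | 0.131 | 0.129 | 0.133 | 6.223 |
| Improved matrix factorization model | 2 | 1 | 0.070 | 0.082 | 0.076 | 0.333 | 5.316 |
| Improved matrix factorization model | 2 | 2 | 0.144 | 0.117 | 0.129 | 0.267 | 6.178 |
| Improved matrix factorization model | 2 | 3 | 0.238 | 0.221 | 0.229 | 0.200 | 6.429 |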
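
The five metrics reported in Table 12 are commonly computed as in the sketch below. The definitions used are assumptions, since the formulas are not given here: precision and recall are pooled over all users, coverage is the fraction of the catalogue that appears in at least one recommendation list, and popularity is the mean of log(1 + number of ordering users) over recommended products. All example inputs other than the recommended product numbers are hypothetical.

```python
import math


def evaluate_top_n(recommended, relevant, item_popularity, n_items):
    """Compute precision, recall, F1, coverage and popularity of Top-N lists."""
    hits = n_rec = n_rel = 0
    covered = set()
    log_pop = 0.0
    for user, recs in recommended.items():
        truth = relevant.get(user, set())
        hits += len(set(recs) & truth)
        n_rec += len(recs)
        n_rel += len(truth)
        covered.update(recs)
        log_pop += sum(math.log(1 + item_popularity[i]) for i in recs)
    precision = hits / n_rec if n_rec else 0.0
    recall = hits / n_rel if n_rel else 0.0
    denom = precision + recall
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
        "coverage": len(covered) / n_items,
        "popularity": log_pop / n_rec if n_rec else 0.0,
    }


if __name__ == "__main__":
    recommended = {"A": [6, 3], "B": [3, 5]}   # product numbers as in Table 13
    relevant = {"A": {3}, "B": {5, 8}}         # hypothetical held-out orders
    popularity = {3: 500, 5: 400, 6: 120}      # hypothetical ordering-user counts
    print(evaluate_top_n(recommended, relevant, popularity, n_items=15))
```

With 15 target products, the coverage values in Table 12 are multiples of 1/15, for example 0.400 corresponds to 6 distinct recommended products.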

Table 13

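Examples of recommendation

| User A: recommended product No. | User A: interest score | User B: recommended product No. | User B: interest score |
| --- | --- | --- | --- |
| 6 | 0.817 | 3 | 0.789 |
| 3 | 0.804 | 5 | 0.769 |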
