Cite this article: | ZHANG Suiyu, YANG Cheng. A multi-label unified domain embedding model for recommender[J]. Journal of Harbin Institute of Technology, 2020, 52(5): 179. DOI: 10.11918/201904214 |
|
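Abstract: |
Collaborative filtering is a simple recommendation method that exploits association knowledge, but it performs poorly when the data are highly sparse. Factorization machines address the feature-combination problem under data sparsity, and, combined with deep neural networks that extract high-order features, have led to a series of deep learning prediction models with good results. These models, however, benefit mainly from combining a large number of knowledge labels and from understanding high-order features, so their performance degrades severely when the data contain few label categories. To address recommendation on sparse data with few label categories, this paper proposes a multi-label unified domain embedding method and, building on it, designs and implements a recommendation model based on unified domain embedding. Feature labels are first divided by domain and converted into feature vectors by an embedding layer; a mapping layer based on the feature-space representation then embeds the feature vectors from their current domain into the unified domain; finally, spatial association operations are applied to the unified-domain vectors to predict ratings. Strong recent deep learning prediction models serve as baselines, and predictions are made on several mainstream open datasets. The experimental results show that the multi-label unified domain embedding model outperforms the other models in recommendation accuracy and performance, that it can overcome bottlenecks in neural network training, and that it offers a feasible solution for recommender systems when data labels are scarce. |
Keywords: recommender system; deep learning; factorization machine; collaborative filtering; sparsity; unified domain |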
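DOI: 10.11918/201904214 |
Classification number: TP181 |
Document code: A |
Funding: Major Research Cultivation Project of Communication University of China (CUC19ZD003); Outstanding Doctoral Supervisor Group Project of Communication University of China (CUC2019A009); Beijing Universities "High-Precision and Advanced" Discipline Construction Project (CUC190J054) |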
|
A multi-label unified domain embedding model for recommender |
ZHANG Suiyu, YANG Cheng
|
(School of Information and Communication Engineering, Communication University of China, Beijing 100024, China)
|
Abstract: |
Collaborative filtering is a simple recommendation method that uses association knowledge, but it performs poorly when the data are highly sparse. Factorization machines (FM) address the problem of feature combination under data sparsity. Combined with high-order feature extraction via deep neural networks, they have led to a series of deep learning prediction models that achieve good results. However, these models depend mainly on combining a large number of labels and on understanding high-order features, so their performance degrades severely when the data have few label categories. To address recommendation with sparse data and scarce label categories, a novel neural network-based recommendation model is proposed that embeds multi-domain labels into a unified domain. First, labels are divided by domain and transformed into feature vectors through an embedding layer. Then, a mapping layer embeds the feature vectors from their current domain into the unified domain. Finally, the spatial relations of the unified-domain vectors are calculated and predictions are made. Experiments on several open datasets show that the proposed model achieves higher accuracy and better performance than mainstream neural network-based prediction models. The model overcomes the training bottleneck that arises when labels are scarce and provides a solution for recommender systems with limited original data. |
Key words: recommender system; deep learning; factorization machine; collaborative filtering; sparse; unified domain |
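
The three stages named in the abstract (per-domain embedding, mapping into a unified domain, and a spatial relation between unified-domain vectors) can be illustrated with a small sketch. The abstract does not specify the form of the mapping layer or of the spatial relation, so the linear mapping, the inner-product score, the global bias, the dimensions, and every name below (`DomainEmbedding`, `to_unified`, `predict_score`) are assumptions made for illustration only, with random numbers standing in for trained parameters. It is not the authors' implementation.

```python
import random

random.seed(0)

EMBED_DIM = 4    # per-domain embedding size (assumed)
UNIFIED_DIM = 3  # unified-domain size (assumed)


def random_matrix(rows, cols):
    """Small random matrix standing in for trained parameters."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]


class DomainEmbedding:
    """Embedding table for one label domain plus a linear map
    into the unified domain (the linear form is an assumption)."""

    def __init__(self, num_labels):
        self.table = random_matrix(num_labels, EMBED_DIM)
        self.mapping = random_matrix(UNIFIED_DIM, EMBED_DIM)

    def to_unified(self, label_id):
        vec = self.table[label_id]  # embedding layer: table lookup
        # mapping layer: matrix-vector product into the unified domain
        return [sum(w * x for w, x in zip(row, vec))
                for row in self.mapping]


def predict_score(user_vec, item_vec, bias=3.0):
    """Inner product in the unified domain plus a global bias
    (an assumed stand-in for the 'spatial relation' step)."""
    return bias + sum(u * v for u, v in zip(user_vec, item_vec))


user_domain = DomainEmbedding(num_labels=5)  # hypothetical user-ID domain
item_domain = DomainEmbedding(num_labels=7)  # hypothetical item-ID domain

u = user_domain.to_unified(2)
v = item_domain.to_unified(4)
score = predict_score(u, v)
print(len(u), len(v), round(score, 3))
```

Because both domains are projected into the same space, vectors from different label domains become directly comparable, which is the property the unified domain is meant to provide. In a trained model the tables and mappings would be learned from observed ratings rather than sampled at random.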