Cite this article: 邹鹏, 于渤, 王宪全. Cost-sensitive customer segmentation under data drift [J]. Journal of Harbin Institute of Technology, 2011, 43(1): 119. DOI:10.11918/j.issn.0367-6234.2011.01.024
ZOU Peng, YU Bo, WANG Xian-quan. Cost-sensitive learning method with data drift in customer segmentation [J]. Journal of Harbin Institute of Technology, 2011, 43(1): 119. DOI:10.11918/j.issn.0367-6234.2011.01.024
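Abstract:
To address data drift and the imbalanced distribution of customer value in data mining, a new method is adopted that combines staged clustering with a cost-sensitive support vector machine. The method first clusters all customers to obtain customer groups with similar characteristics. It then clusters cities by the posterior probability that a customer in a given region belongs to each customer group; city groups with similar posterior probability distributions are considered to have similar customer structures. The customers of each city group form a new customer sample, and cost-sensitive classification is performed on each sample separately to complete the customer segmentation. Comparative experiments show that the method improves overall prediction accuracy and the ability to identify high-value customers, and reduces the model's misclassification cost. While maintaining classification accuracy, the improved method better helps enterprises target high-end customers and dynamically adjust regional market strategies.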
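Keywords: cost-sensitive learning; support vector machine; customer segmentation; data drift
DOI: 10.11918/j.issn.0367-6234.2011.01.024
CLC number: TP311.13
Funding: National Natural Science Foundation of China (70802019)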
Cost-sensitive learning method with data drift in customer segmentation
ZOU Peng, YU Bo, WANG Xian-quan
School of Management,Harbin Institute of Technology,150001 Harbin,China
Abstract:
To solve the problems of data drift and asymmetric misclassification costs in customer segmentation, a cost-sensitive learning method integrated with two-step clustering is proposed. The method first applies k-means clustering to the posterior probability distribution of each region to group similar regions together, and then uses a cost-sensitive support vector machine to segment the customers of each region group. The results show that clustering based on the similarity of customer segmentation structure improves overall accuracy, and that the proposed cost-sensitive support vector machine distinguishes high-value customers more effectively than the original support vector machine.
Key words: cost-sensitive learning; support vector machine; customer segmentation; data drift
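
The region-grouping and cost-evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the cost matrix, and the distance threshold are all hypothetical. For brevity it groups cities with a simple greedy threshold rule rather than the k-means clustering named in the abstract, and it only scores predictions against a cost matrix instead of training a cost-sensitive support vector machine.

```python
from collections import Counter, defaultdict

def city_posteriors(city_of, cluster_of, n_clusters):
    """Estimate P(customer cluster | city) from per-city cluster counts."""
    counts = defaultdict(Counter)
    for city, cluster in zip(city_of, cluster_of):
        counts[city][cluster] += 1
    return {
        city: [c[k] / sum(c.values()) for k in range(n_clusters)]
        for city, c in counts.items()
    }

def group_cities(posteriors, threshold):
    """Greedy stand-in for k-means: a city joins the first group whose
    seed profile lies within `threshold` (L1 distance), else starts a new one."""
    groups = []  # list of (seed_profile, member_cities)
    for city in sorted(posteriors):
        profile = posteriors[city]
        for seed, members in groups:
            if sum(abs(a - b) for a, b in zip(profile, seed)) <= threshold:
                members.append(city)
                break
        else:
            groups.append((profile, [city]))
    return [members for _, members in groups]

def misclassification_cost(y_true, y_pred, cost):
    """Total cost, where cost[t][p] is the cost of predicting p when the truth is t."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

# Hypothetical toy data: 12 customers in 3 cities, already assigned
# to 2 customer clusters by a first-stage clustering.
city_of = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
cluster_of = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0]

posteriors = city_posteriors(city_of, cluster_of, n_clusters=2)
city_groups = group_cities(posteriors, threshold=0.2)

# Assumed asymmetric costs: missing a high-value customer (true 1, predicted 0)
# costs 5; the opposite error costs 1.
cost = [[0, 1], [5, 0]]
total_cost = misclassification_cost([1, 1, 0, 0], [0, 1, 1, 0], cost)
```

In this toy example cities A and B share the same posterior profile and end up in one group, while C forms its own. In a fuller pipeline, a separate cost-sensitive classifier would then be trained on the customers of each city group.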