Cite this article: LIU Xiaofang, LIU Ce, LIU Lumi, CHENG Dansong. Overlapped Local Gaussian Process Regression[J]. Journal of Harbin Institute of Technology, 2019, 51(11): 22. DOI: 10.11918/j.issn.0367-6234.201904056

CLC number: TP391.4
Document code: A
Fund program: National Natural Science Foundation of China (51677042)
|
Overlapped Local Gaussian Process Regression |
LIU Xiaofang2,LIU Ce1,LIU Lumi3,CHENG Dansong1
|
(1.School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China; 2.School of Electrical Engineering & Automation, Harbin Institute of Technology, Harbin 150001, China; 3.Beijing Institute of Control Engineering, Beijing 100000, China)
|
Abstract: |
Gaussian processes (GPs) are distributions over functions and are commonly used for regression in the machine learning community. For n training samples, the time complexities of training and prediction are O(n^3) and O(n^2), respectively; this high computational cost hinders the application of GPs to large datasets. Inspired by the divide-and-conquer strategy, this paper proposes a simple but efficient approximate model, called the Overlapped Local Gaussian Process (OLGP). The method assumes that, given its nearest neighbors, a random variable is conditionally independent of more distant ones. The training samples are recursively divided, yielding a ternary tree in which sibling nodes share overlapping samples; these shared samples act as inducing points that model the dependency between neighboring regions. Each leaf node is associated with a local GP regression model. Under the above assumption, the evidence and predictive distribution at each parent node are approximated by combining those of its children, which significantly reduces the computational cost. Furthermore, this combination guarantees that the fitted function is continuous. A theoretical analysis shows that the time complexity of both training and prediction reduces to O(n^t), where t depends on the proportion of the intersection at each level and is usually between 1 and 2. The effectiveness of the method is demonstrated by its speed-accuracy performance on several real-world datasets.
Key words: pattern recognition; Gaussian process; regression; Bayesian learning; large-scale machine learning
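As a rough illustration of the overlapped-partition idea described in the abstract, the following is a minimal 1-D sketch: the data are sorted and split into contiguous blocks that share a few boundary points, and each block fits its own exact GP, so no single solve ever touches all n points. This is not the paper's recursive ternary-tree construction or its evidence-combination rule; the RBF kernel, block count, overlap size, and noise level here are all illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel for 1-D inputs.
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls ** 2)

def gp_mean(X, y, Xs, noise=1e-2):
    # Exact GP posterior mean: the linear solve costs O(n^3)
    # in the number of training points handed to it.
    K = rbf(X, X) + noise * np.eye(len(X))
    return rbf(Xs, X) @ np.linalg.solve(K, y)

def overlapped_local_gp(X, y, Xs, n_blocks=4, overlap=5):
    # Sort the inputs and split them into contiguous blocks that
    # share `overlap` boundary points with each neighbor; the shared
    # points loosely mimic the inducing points in the paper's
    # sibling-node intersections.  Each block runs its own exact GP.
    order = np.argsort(X)
    X, y = X[order], y[order]
    size = len(X) // n_blocks
    mean = np.empty(len(Xs))
    for j, xs in enumerate(Xs):
        # Route each test point to the block whose core region covers it.
        b = min(np.searchsorted(X, xs) // size, n_blocks - 1)
        lo = max(0, b * size - overlap)
        hi = min(len(X), (b + 1) * size + overlap)
        mean[j] = gp_mean(X[lo:hi], y[lo:hi], np.asarray([xs]))[0]
    return mean

# Toy data: a noisy sine wave.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, 400)
y = np.sin(X) + 0.1 * rng.standard_normal(400)
Xs = np.linspace(0.0, 10.0, 50)
pred = overlapped_local_gp(X, y, Xs)
print("max abs error vs. sin:", np.max(np.abs(pred - np.sin(Xs))))
```

Each local solve here involves roughly n/4 + 10 points instead of all 400, which is the source of the complexity reduction; the full method additionally recurses on the partition and combines the children's evidence at each parent node.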