A method of abnormal data recognition of multi-source traffic with non-equilibrium feature

XING Xue; YU Dexin; ZHOU Huxing; TIAN Xiujuan

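Supervised by: Ministry of Industry and Information Technology of the People's Republic of China; Sponsored by: Harbin Institute of Technology; Editor-in-chief: LI Longqiu; ISSN 0367-6234; CN 23-1235/T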

Cite this article: XING Xue, YU Dexin, ZHOU Huxing, TIAN Xiujuan. A method of abnormal data recognition of multi-source traffic with non-equilibrium feature[J]. Journal of Harbin Institute of Technology, 2019, 51(9): 165. DOI: 10.11918/j.issn.0367-6234.201803092

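XING Xue^1,2, YU Dexin^2,3, ZHOU Huxing^2,3, TIAN Xiujuan^2
(1. College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132022, Jilin, China; 2. Transportation College, Jilin University, Changchun 130022, China; 3. Jilin Engineering Research Center for Intelligent Transportation System, Changchun 13002, China)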

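Abstract:

To ensure the accuracy of traffic detection data and to support real-time traffic state identification and prediction, traffic big data are processed jointly across multiple detection sources, and machine learning is used to recognize abnormal data. Abnormal detection data are recognized mainly with the AdaBoost method from machine learning. During training, to eliminate the outliers that arise in data from a single detection source, the training data are drawn from datasets provided by multiple detection sources on the same road section. During decision-making, the AdaBoost decision is improved by exploiting the advantages of cost-sensitive methods. Experimental results show that the AdaBoost model improved for the non-equilibrium characteristic forces the classifier to pay more attention to the abnormal samples to be recognized, makes the decision tree rules trained in the AdaBoost decision process more representative, and raises the classification accuracy on abnormal-class samples. A real expressway detection dataset was used to verify the detection accuracy, false detection rate, false alarm rate, and other indicators of the improved algorithm and of related classical algorithms; compared with the original model, the improved model increased accuracy by 5.547% and reduced the false detection rate by 6.792%. A comparison of the ROC curves of several algorithms shows that the improved AdaBoost method screens traffic detection samples more reliably and can effectively correct the classification error caused by non-equilibrium data.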

Key words: AdaBoost; abnormal data recognition; multi-source traffic data; non-equilibrium detection data; machine learning

DOI: 10.11918/j.issn.0367-6234.201803092

CLC number: U491.1

Document code: A

Funding: National Science and Technology Support Program of China (2014BAG03B03)

A method of abnormal data recognition of multi-source traffic with non-equilibrium feature

XING Xue^1,2, YU Dexin^2,3, ZHOU Huxing^2,3, TIAN Xiujuan^2

(1. College of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin 132022, Jilin, China; 2. Transportation College, Jilin University, Changchun 130022, China; 3. Jilin Engineering Research Center for Intelligent Transportation System, Changchun 130022, China)

Abstract:

Identifying and predicting real-time traffic conditions depends on accurate detection data. To ensure that accuracy, abnormal data in traffic big data are recognized by applying machine learning to data from multiple detection sources. The recognition is based on the AdaBoost method. To eliminate the outliers that occur in data from a single detection source, the training data are drawn from datasets provided by multiple detection sources on the same road section. A cost-sensitive method is used to improve the decision-making process of AdaBoost. Experimental results show that the improved AdaBoost model forces the classifier to pay more attention to abnormal-class samples, which makes the decision tree rules trained within AdaBoost more representative and improves the classification accuracy on abnormal samples. The detection accuracy, false detection rate, false alarm rate, and other indicators of the improved algorithm and of related classical algorithms were verified on a real highway detection dataset. Compared with the original model, the improved model increased accuracy by 5.547% and reduced the false detection rate by 6.792%. A comparison of ROC curves shows that the improved AdaBoost method is more reliable in identifying abnormal traffic detection samples and can effectively correct the classification error caused by non-equilibrium data.

Key words: AdaBoost; abnormal data recognition; multi-source traffic data; non-equilibrium detection data; machine learning
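
The abstract names the ingredients of the method (AdaBoost, tree-based weak learners, and a cost-sensitive adjustment for the imbalanced abnormal class) but does not state the update rule. The following is a minimal sketch only, assuming the simplest cost-sensitive variant, in which abnormal samples start with a larger weight than normal ones and the rest of the procedure is standard discrete AdaBoost over one-level decision trees (stumps). It is not the authors' implementation. All names (`fit_cost_sensitive_adaboost`, `cost_abnormal`, the toy readings and their two features) are hypothetical.

```python
import math


def stump_predict(stump, row):
    """Predict +1 (abnormal) or -1 (normal) with a one-level decision tree."""
    _, feature, threshold, sign = stump
    return sign if row[feature] > threshold else -sign


def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                candidate = (0.0, feature, threshold, sign)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(candidate, row) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best


def fit_cost_sensitive_adaboost(X, y, cost_abnormal=5.0, rounds=10):
    """Discrete AdaBoost with cost-weighted initial sample weights.

    Labels: +1 = abnormal (minority class), -1 = normal.
    cost_abnormal is an assumed misclassification cost ratio, not a value
    taken from the paper.
    """
    w = [cost_abnormal if yi == 1 else 1.0 for yi in y]
    total = sum(w)
    w = [wi / total for wi in w]

    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        raw_err = stump[0]
        if raw_err >= 0.5:          # no better than chance: stop boosting
            break
        err = max(raw_err, 1e-10)   # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        if raw_err == 0:            # training set already separated
            break
        # Standard AdaBoost reweighting: misclassified samples gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps; ties and empty ensembles map to normal."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score > 0 else -1


# Toy data (invented): [speed reading, disagreement between two sources].
X_toy = [[60, 2], [58, 3], [62, 1], [55, 4], [61, 2],
         [59, 3], [57, 1], [63, 2], [40, 25], [65, 30]]
y_toy = [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1]

model = fit_cost_sensitive_adaboost(X_toy, y_toy, cost_abnormal=5.0, rounds=5)
print([predict(model, row) for row in X_toy])
```

Raising `cost_abnormal` shifts the first weak learners toward rules that separate the rare abnormal readings, which is one way to obtain the behaviour the abstract describes as forcing the classifier to pay more attention to abnormal samples. The second feature in the toy data stands in for the multi-source idea: a reading is suspicious when sources on the same road section disagree.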
