Combining Multi-scale Directed Depth Motion Maps and Log-Gabor Filters for Human Action Recognition

Xiaoye Zhao; Xunsheng Ji; Yuanxiang Li; Li Peng

Supervised by: Ministry of Industry and Information Technology of the People's Republic of China. Sponsored by: Harbin Institute of Technology. Editor-in-chief: Yu Zhou. ISSN 1005-9113. CN 23-1378/T.

Related citation:

Xiaoye Zhao, Xunsheng Ji, Yuanxiang Li, Li Peng. Combining Multi-scale Directed Depth Motion Maps and Log-Gabor Filters for Human Action Recognition[J]. Journal of Harbin Institute of Technology (New Series), 2019, 26(4): 89-96. DOI: 10.11916/j.issn.1005-9113.17090.

Author Name	Affiliation
Xiaoye Zhao	Engineering Research Center of Internet of Things Technology Applications of the Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
Xunsheng Ji	Engineering Research Center of Internet of Things Technology Applications of the Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
Yuanxiang Li	Engineering Research Center of Internet of Things Technology Applications of the Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
Li Peng	Engineering Research Center of Internet of Things Technology Applications of the Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China

Abstract:

Recognition of human actions by computer vision has become an active research area in recent years. Because actions vary in speed and many of them are highly similar, current algorithms do not achieve high recognition rates. A new human action recognition method is proposed that combines multi-scale directed depth motion maps (MsdDMMs) with Log-Gabor filters. To account for differences in the speed and temporal order of an action, MsdDMMs are constructed under an energy framework. Log-Gabor filters are then used to describe the texture details of the MsdDMMs as motion features; they characterize texture well and are consistent with the characteristics of human vision. Finally, a collaborative representation based classifier is used for action recognition. Experimental results show that the proposed algorithm achieves accuracies of 95.79% on the MSRAction3D dataset and 96.43% on the MSRGesture3D dataset, higher than existing algorithms such as super normal vector (SNV) and hierarchical recurrent neural network (Hierarchical RNN).

Key words: human action recognition; depth motion maps; Log-Gabor filters; collaborative representation based classifier

DOI: 10.11916/j.issn.1005-9113.17090

CLC Number: TP391.4

Fund:

Extended description:

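Innovations:

1) To account for differences in the speed at which actions are performed, this paper proposes a new video segmentation method based on equal division of energy. The traditional method divides the n-th pyramid level into equal parts, so that the detail information of level n-1 is fully reflected in level n. To capture as much detail information as possible at the different pyramid levels, this paper divides the n-th pyramid level into equal parts and builds multi-scale depth motion maps.

2) In action recognition, the direction of motion is crucial in addition to body shape and motion information. To account for differences in the temporal order in which actions are performed, this paper proposes directed depth motion maps. A directed DMM consists of a positive DMM (PDMM) and a negative DMM (NDMM). The former captures the shape and motion information where the depth value in the current frame is larger than in the previous frame; the latter captures it where the depth value in the current frame is smaller. For two actions that are similar but opposite in temporal order, the PDMM and NDMM are exactly interchanged, so a representation based on PDMM and NDMM can tell the two actions apart. Combining these gives the energy-based multi-scale directed depth motion maps (MsdDMMs).

3) To describe the texture details of the MsdDMMs, this paper uses Log-Gabor features, which are strong at texture characterization and consistent with the characteristics of human vision.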
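As a rough illustration of item 3), the sketch below builds one Log-Gabor filter in the frequency domain and applies it to a map via the FFT. It uses the standard log-Gabor form (a Gaussian on a logarithmic frequency axis, multiplied by a Gaussian in orientation); the function names and the parameter values `f0`, `sigma_ratio`, `theta0` and `sigma_theta` are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def log_gabor_filter(rows, cols, f0=0.1, sigma_ratio=0.55,
                     theta0=0.0, sigma_theta=np.pi / 8):
    """Frequency-domain Log-Gabor filter in FFT layout (DC at [0, 0]).

    All default parameter values are assumptions chosen for illustration.
    """
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequency, cycles/pixel
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequency, cycles/pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                   # avoid log(0); DC is zeroed below
    # Radial part: Gaussian on a log-frequency axis centred at f0.
    radial = np.exp(-(np.log(radius / f0) ** 2)
                    / (2 * np.log(sigma_ratio) ** 2))
    radial[0, 0] = 0.0                   # a Log-Gabor filter has no DC response
    # Angular part: Gaussian in the wrapped angular distance to theta0.
    theta = np.arctan2(fy, fx)
    dtheta = np.arctan2(np.sin(theta - theta0), np.cos(theta - theta0))
    angular = np.exp(-(dtheta ** 2) / (2 * sigma_theta ** 2))
    return radial * angular

def log_gabor_response(image, **kwargs):
    """Magnitude of the filtered image, usable as a texture feature map."""
    image = np.asarray(image, dtype=float)
    g = log_gabor_filter(*image.shape, **kwargs)
    return np.abs(np.fft.ifft2(np.fft.fft2(image) * g))
```

In a full pipeline one would typically apply a bank of such filters at several scales and orientations and concatenate the responses; that bank layout is not specified here and is left out of the sketch.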

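The new algorithm was verified experimentally, and extensive comparative experiments on the parameter settings were carried out to find the best parameters. The results show that the algorithm reaches accuracies of 95.79% and 96.43% respectively, with a higher recognition rate and better robustness than many existing algorithms.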
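The energy-based segmentation in item 1) can be pictured with the following minimal sketch, which cuts a sequence so that each segment holds roughly the same share of accumulated motion energy rather than the same number of frames. The per-frame energy values, the function name and the number of parts are hypothetical; how the energy is measured and how many parts each pyramid level uses are not specified here.

```python
def energy_split_points(frame_energy, parts):
    """Return cut indices that split a sequence into `parts` segments
    of roughly equal accumulated energy.

    frame_energy: per-frame motion energy (hypothetical input).
    A cut index c means one segment ends just before frame c.
    """
    total = sum(frame_energy)
    cuts = []
    running = 0.0
    target = 1
    for i, e in enumerate(frame_energy):
        running += e
        # Emit a cut each time the running energy reaches k/parts of the total.
        while target < parts and running >= total * target / parts:
            cuts.append(i + 1)
            target += 1
    return cuts

# A slow start followed by a fast finish: the cut lands late in the clip,
# so both halves carry the same energy (8 and 8) despite unequal lengths.
energy = [1, 1, 1, 1, 4, 4, 4]
cuts = energy_split_points(energy, 2)   # cuts == [5]
```

Cutting by energy rather than by frame count is what makes the segments comparable between a slow and a fast performance of the same action.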

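Research purpose:

A DMM is computed from the entire depth sequence, so it loses the temporal information of the action itself. Two actions that look similar but differ in temporal order, such as "sit down" and "stand up", are therefore hard to tell apart. In addition, DMMs do not account for the intra-class variation caused by differences in execution speed, which lowers the recognition rate. The aim of this paper is to raise the recognition rate while staying as close to real-time operation as possible.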
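A minimal sketch of the directed maps, assuming a sequence of already projected depth maps stored as a NumPy array of shape (frames, height, width): positive frame-to-frame differences accumulate into the PDMM and negative ones into the NDMM. The function name is hypothetical, and the projection onto front, side and top views and the multi-scale segmentation are omitted.

```python
import numpy as np

def directed_dmm(frames):
    """Accumulate positive and negative frame differences separately.

    frames: array-like of shape (T, H, W) holding depth maps.
    Returns (pdmm, ndmm), each of shape (H, W).
    """
    frames = np.asarray(frames, dtype=float)
    diff = np.diff(frames, axis=0)               # current frame minus previous
    pdmm = np.clip(diff, 0, None).sum(axis=0)    # depth increased
    ndmm = np.clip(-diff, 0, None).sum(axis=0)   # depth decreased
    return pdmm, ndmm

# Playing a sequence backwards (e.g. "stand up" vs "sit down") swaps the maps.
rng = np.random.default_rng(0)
seq = rng.integers(0, 10, size=(5, 4, 3))
p_fwd, n_fwd = directed_dmm(seq)
p_rev, n_rev = directed_dmm(seq[::-1])
```

In this sketch `pdmm + ndmm` equals the sum of absolute frame differences, so an undirected map built that way cannot separate the forward and reversed sequences, while the directed pair can.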

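Research methods:

Method: the experiments were mainly simulations run in MATLAB, carried out on the public action recognition dataset MSRAction3D and the gesture recognition dataset MSRGesture3D. The recognition rates of the proposed method were then compared with those of other existing algorithms; the proposed method is more accurate, reaching 95.79% and 96.43% respectively. Confusion matrices show how each action is recognized in the two datasets and further indicate that the method effectively reduces the misclassification of similar actions. Comparative experiments were also run on the choice of parameters.

Experimental setup: MSRAction3D is a public human action dataset containing 20 actions, each repeated 2 to 3 times by 10 performers, for a total of 557 video sequences. Many of its actions are highly similar, which makes it very challenging. For ease of performance comparison, the 20 actions are treated as a single set; data from the odd-numbered performers are used for training and data from the even-numbered performers for testing. In the experiments, the front, side and top MsdDMMs are normalized to 102×54, 102×75 and 75×54 respectively, the Log-Gabor filter size is set to 10×11, and the regularization parameter λ in CRC is set to 0.001.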
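The role of the regularization parameter λ can be seen in a minimal sketch of a collaborative representation classifier: the query is coded over all training samples with an L2 (ridge) penalty, and the class whose samples reconstruct it with the smallest residual wins. This is a generic sketch under stated assumptions; the function name and toy data are hypothetical, and any weighting of the penalty or normalization of the residual used in the paper is not reproduced.

```python
import numpy as np

def crc_classify(train, labels, query, lam=0.001):
    """Classify `query` by class-wise reconstruction residual.

    train:  (d, n) matrix whose columns are training feature vectors.
    labels: length-n sequence of class labels, one per column.
    query:  length-d feature vector.
    lam:    L2 regularization weight (the lambda of the text).
    """
    train = np.asarray(train, dtype=float)
    labels = np.asarray(labels)
    query = np.asarray(query, dtype=float)
    n = train.shape[1]
    # Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y
    alpha = np.linalg.solve(train.T @ train + lam * np.eye(n),
                            train.T @ query)
    best_label, best_residual = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruct the query using only the samples of class c.
        residual = np.linalg.norm(query - train[:, mask] @ alpha[mask])
        if residual < best_residual:
            best_label, best_residual = c, residual
    return best_label

# Toy data: class 0 lies near [1, 0], class 1 near [0, 1].
X = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9]])
y = [0, 0, 1, 1]
pred = crc_classify(X, y, [1.0, 0.05], lam=0.001)
```

Because the coefficients have a closed form, the matrix `(X^T X + lam * I)^{-1} X^T` can be computed once from the training set and reused for every query.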

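MSRGesture3D is a human gesture evaluation dataset containing 12 dynamic gestures defined by American Sign Language, each repeated 2 to 3 times by 10 performers, for a total of 333 video sequences. The dataset contains many self-occlusions. Leave-one-subject-out cross-validation is used, with 10 runs in total: run n uses all the data from performer n as the test set and the data from the remaining performers as the training set, and the mean over the 10 runs is taken as the final recognition rate.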
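The leave-one-subject-out protocol can be sketched as follows. The sample layout (a list of `(subject_id, item)` pairs) and the `evaluate` callback are assumptions made for illustration.

```python
def leave_one_subject_out(samples):
    """Yield (held_out_subject, train_items, test_items) for each subject.

    samples: list of (subject_id, item) pairs.
    """
    subjects = sorted({subject for subject, _ in samples})
    for held_out in subjects:
        train = [item for subject, item in samples if subject != held_out]
        test = [item for subject, item in samples if subject == held_out]
        yield held_out, train, test

def mean_accuracy(samples, evaluate):
    """Average the score returned by evaluate(train, test) over all folds."""
    scores = [evaluate(train, test)
              for _, train, test in leave_one_subject_out(samples)]
    return sum(scores) / len(scores)

# Hypothetical data: 10 performers with 3 clips each gives 10 folds.
clips = [(s, f"clip_{s}_{k}") for s in range(1, 11) for k in range(3)]
folds = list(leave_one_subject_out(clips))
```

Holding out a whole performer at a time keeps every clip of that person out of training, so the score reflects generalization to unseen people.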

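Results:

1) On the public action recognition dataset MSRAction3D and the gesture recognition dataset MSRGesture3D, the recognition rate reaches 95.79% and 96.43% respectively, higher than that of many existing algorithms.

2) On MSRAction3D, the highest recognition rate, 95.79%, is obtained with λ = 0.001 and the Log-Gabor descriptor.

3) On MSRGesture3D, the highest recognition rate, 96.43%, is obtained with λ = 0.01 and the Log-Gabor descriptor.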

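Conclusions:

MSRAction3D dataset:

Recognition rate for different values of λ

λ	0.0001	0.001	0.01	0.1	1
Accuracy (%)	95.41	95.79	95.05	95.05	94.34

Recognition rate with different feature descriptors

Descriptor	HOG	LBP	Gabor	Log-Gabor
Accuracy (%)	92.22	94.35	94.70	95.79

[Figure: confusion matrix]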

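MSRGesture3D dataset:

Recognition rate for different values of λ

λ	0.0001	0.001	0.01	0.1	1
Accuracy (%)	94.60	96.32	96.43	95.88	93.98

Recognition rate with different feature descriptors

Descriptor	HOG	LBP	Gabor	Log-Gabor
Accuracy (%)	93.60	94.70	95.44	96.43

[Figure: confusion matrix]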

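This paper proposes a human action recognition method that combines energy-based MsdDMMs with Log-Gabor filters. The algorithm first builds an energy-based MsdDMM representation that accounts for the speed and temporal order of an action; it then extracts Log-Gabor texture features as the action descriptor to capture the detail information of the MsdDMMs; finally, it uses CRC for action recognition. Experiments show that the proposed algorithm achieves a higher recognition rate and better robustness in human action recognition than many existing algorithms.

Keywords: human action recognition; depth motion maps; Log-Gabor filters; collaborative classifier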
