A new SLU function for deep convolutional neural networks

doi:10.11918/j.issn.0367-6234.201703117

2018, Vol. 50, No. 4, pp. 117-123

Authors: ZHAO Huizhen (赵慧珍), LIU Fuxian (刘付显), LI Longyue (李龙跃)
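Affiliation: School of Air and Missile Defense, Air Force Engineering University, Xi'an 710051, China
Author biographies: ZHAO Huizhen (born 1990), female, PhD candidate; LIU Fuxian (born 1962), male, professor, doctoral supervisor
Corresponding author: ZHAO Huizhen, happy100zhao90@163.com
CLC number: TP183
Funding: National Natural Science Foundation of China (61601499)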

A novel softplus linear unit for deep CNN

Abstract (translated from the Chinese original):

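The rectified linear unit (ReLU) is a commonly used activation function in deep convolutional neural networks. When the input is negative, however, ReLU outputs zero, which causes a zero-gradient problem. When the input is positive, ReLU passes the input through unchanged, so the mean of its output is always greater than zero, which causes a bias shift. Both effects limit the learning rate and the learning performance of deep convolutional neural networks. To address the zero-gradient problem and the bias shift of ReLU, the function is modified according to the principle that "activation functions whose mean output is close to zero improve the learning performance of neural networks", and the SLU (softplus linear unit) function is proposed. First, negative inputs are processed with the softplus function so that SLU outputs negative values for negative inputs, which brings the mean output closer to zero and mitigates the bias shift. Second, to keep the gradient stable, the parameters of SLU are constrained and the parameter of the positive part is fixed. Finally, the parameters of the negative part are adjusted according to how SLU handles positive inputs, so that the activation function is continuous and differentiable at zero and information can propagate in both directions. A deep autoencoder model is designed for unsupervised learning on the MNIST dataset, and a Network-in-Network convolutional neural network model is designed for supervised learning on the CIFAR-10 dataset. The experimental results show that, compared with ReLU and its related improved units, neural network models based on the SLU function have better feature-learning ability and higher learning accuracy.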

Abstract:

Currently, the most popular activation function for deep convolutional neural networks is the rectified linear unit (ReLU). ReLU outputs zero for negative inputs, which causes some neurons to die, and passes positive inputs through unchanged, which induces a bias shift. Following the theory that "zero-mean activations improve learning ability", the softplus linear unit (SLU) is introduced as an adaptive activation function that tackles both problems. First, negative inputs are processed with the softplus function, which pushes the mean output of the activation function toward zero and reduces the bias shift. Second, the parameters of the positive part are fixed to control vanishing gradients. Third, to maintain continuity and differentiability at zero, the parameters of the negative part are set according to the positive part. Experiments are conducted on the MNIST dataset for unsupervised learning with deep autoencoder networks, and on the CIFAR-10 dataset for supervised learning with deep convolutional neural networks. The results show faster convergence and better image-classification performance for SLU-based networks than for networks with rectified activation functions.
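The abstract does not give a closed form for SLU, so the following is only a minimal sketch under stated assumptions, and the parameterization in the full paper may differ. It assumes the positive part is the identity (slope fixed to 1) and the negative part has the form a·softplus(x) − c. Continuity at zero then requires c = a·ln 2, and matching the derivative at zero (a·sigmoid(0) = a/2 = 1) requires a = 2. The names `slu`, `slu_grad`, and `relu` are hypothetical helpers introduced for illustration.

```python
import math

LN2 = math.log(2.0)


def softplus(x: float) -> float:
    """Numerically stable softplus, ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def relu(x: float) -> float:
    """Standard rectified linear unit."""
    return max(x, 0.0)


def slu(x: float) -> float:
    """Sketch of a softplus linear unit (assumed form, not the paper's code).

    Positive part: identity (assumed slope 1).
    Negative part: 2 * softplus(x) - 2 * ln 2, the unique a * softplus(x) - c
    that is continuous and differentiable at zero given a unit positive slope.
    """
    if x >= 0.0:
        return x
    return 2.0 * (softplus(x) - LN2)


def slu_grad(x: float) -> float:
    """Derivative of the sketch above: 1 for x >= 0, 2 * sigmoid(x) for x < 0."""
    if x >= 0.0:
        return 1.0
    e = math.exp(x)  # x < 0, so this cannot overflow
    return 2.0 * e / (1.0 + e)


# Compare mean activations on a symmetric grid of inputs in [-5, 5].
xs = [i / 10.0 for i in range(-50, 51)]
mean_relu = sum(relu(x) for x in xs) / len(xs)
mean_slu = sum(slu(x) for x in xs) / len(xs)

if __name__ == "__main__":
    print(f"mean ReLU output: {mean_relu:.3f}")
    print(f"mean SLU output:  {mean_slu:.3f}")
    print(f"SLU lower bound:  {-2.0 * LN2:.3f}")
```

Under these assumptions the negative branch saturates at −2·ln 2 rather than at zero, so negative inputs contribute negative outputs and a nonzero gradient, and the mean output on symmetric inputs sits closer to zero than with ReLU.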


Cite this article

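ZHAO Huizhen, LIU Fuxian, LI Longyue. A new SLU function for deep convolutional neural networks [J]. Journal of Harbin Institute of Technology, 2018, 50(4): 117. DOI:10.11918/j.issn.0367-6234.201703117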


History

Received: 2017-03-23
Published online: 2018-05-08
