High efficient activation function design for CNN model image classification task

Shengjie Du; Xiaofen Jia; Yourui Huang; Yongcun Guo; Baiting Zhao

doi:10.3788/IRLA20210253

Infrared and Laser Engineering
Vol. 51, Issue 3, 20210253 (2022)

Author Affiliations

School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232000, China

Shengjie Du, Xiaofen Jia, Yourui Huang, Yongcun Guo, Baiting Zhao. High efficient activation function design for CNN model image classification task[J]. Infrared and Laser Engineering, 2022, 51(3): 20210253

Fig. 1. Functions (a) f₁, (b) f₂, (c) f₃, (d) f₄ and their graphs

Fig. 2. Derivatives of (a) f₁, (b) f₂, (c) f₃, (d) f₄ and their graphs

Fig. 3. Test accuracy (a) and training time (b) of different activation functions using ResNet18 network on CIFAR10

Fig. 4. Test accuracy (a) and training time (b) of different activation functions using VGG16 network on CIFAR10

Fig. 5. Test accuracy (a) and training time (b) of different activation functions using ResNet18 network on CIFAR100

Fig. 6. Test accuracy (a) and training time (b) of different activation functions using VGG16 network on CIFAR100

Fig. 7. Test accuracy (a) and training time (b) of different activation functions using ResNet18 network on Fer2013

Fig. 8. Test accuracy (a) and training time (b) of different activation functions using VGG16 network on Fer2013

Function	Function model
f₁	$f_1(x) = \begin{cases} x, & x \geqslant 0 \\ -x, & x < 0 \end{cases}$
f₂	$f_2(x) = \begin{cases} x, & x \geqslant 0 \\ -\dfrac{2}{3}(-x)^{\frac{3}{2}}, & x < 0 \end{cases}$
f₃	$f_3(x) = \begin{cases} x, & x \geqslant 0 \\ \dfrac{x}{1 - x}, & x < 0 \end{cases}$
f₄	$f_4(x) = \begin{cases} x, & x \geqslant 0 \\ -\ln(1 - x), & x < 0 \end{cases}$

Table 1. Mathematical models of four activation functions
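
The piecewise definitions in Table 1 translate directly into code. The following is a minimal scalar sketch in plain Python, not the authors' implementation: the function names simply mirror the table, and the negative branch of f₄ is assumed to be −ln(1 − x), the form whose derivative is the 1/(1 − x) listed in Table 2.

```python
import math


def f1(x: float) -> float:
    """f1: x for x >= 0, -x for x < 0 (the absolute value)."""
    return x if x >= 0 else -x


def f2(x: float) -> float:
    """f2: x for x >= 0, -(2/3) * (-x)^(3/2) for x < 0, as printed in Table 1."""
    return x if x >= 0 else -(2.0 / 3.0) * (-x) ** 1.5


def f3(x: float) -> float:
    """f3: x for x >= 0, x / (1 - x) for x < 0."""
    return x if x >= 0 else x / (1.0 - x)


def f4(x: float) -> float:
    """f4: x for x >= 0, -ln(1 - x) for x < 0 (assumed reading of the table)."""
    # log1p(-x) computes ln(1 - x) accurately for small |x|.
    return x if x >= 0 else -math.log1p(-x)
```

All four functions are the identity for x ≥ 0, like ReLU, and differ only in how they map negative inputs: for example f₁(−2) = 2, f₃(−1) = −0.5, and f₄(−1) = −ln 2 ≈ −0.693.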

Derivative	Function model
f₁′	$f_1'(x) = \begin{cases} 1, & x \geqslant 0 \\ -1, & x < 0 \end{cases}$
f₂′	$f_2'(x) = \begin{cases} 1, & x \geqslant 0 \\ -\sqrt{-x}, & x < 0 \end{cases}$
f₃′	$f_3'(x) = \begin{cases} 1, & x \geqslant 0 \\ \dfrac{1}{(1 - x)^2}, & x < 0 \end{cases}$
f₄′	$f_4'(x) = \begin{cases} 1, & x \geqslant 0 \\ \dfrac{1}{1 - x}, & x < 0 \end{cases}$

Table 2. Derivative models of the four activation functions
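
As a sanity check on Table 2, the derivatives can be transcribed and compared against a central finite difference of the corresponding functions. The sketch below is illustrative plain Python under stated assumptions: the helper names `central_diff` and `max_error` and the step size are hypothetical choices, f₄ is taken as −ln(1 − x), and only f₃ and f₄ are checked numerically. The f₂ entry is transcribed as printed; differentiating the f₂ formula of Table 1 as printed gives +√(−x) for x < 0, so the two printed entries differ in sign and f₂ is left out of the check.

```python
import math


# Derivatives transcribed from Table 2 (negative branch applies for x < 0).
def d_f1(x: float) -> float:
    return 1.0 if x >= 0 else -1.0


def d_f2(x: float) -> float:
    return 1.0 if x >= 0 else -math.sqrt(-x)  # sign as printed in Table 2


def d_f3(x: float) -> float:
    return 1.0 if x >= 0 else 1.0 / (1.0 - x) ** 2


def d_f4(x: float) -> float:
    return 1.0 if x >= 0 else 1.0 / (1.0 - x)


# Functions from Table 1 needed for the check (f4 is an assumed reading).
def f3(x: float) -> float:
    return x if x >= 0 else x / (1.0 - x)


def f4(x: float) -> float:
    return x if x >= 0 else -math.log1p(-x)


def central_diff(f, x: float, h: float = 1e-6) -> float:
    """Hypothetical helper: central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def max_error(f, df, points) -> float:
    """Largest absolute gap between the tabulated and numerical derivative."""
    return max(abs(central_diff(f, x) - df(x)) for x in points)


# Sample points chosen away from 0 so the difference stencil stays on one branch.
points = (-3.0, -0.5, 2.0)
err_f3 = max_error(f3, d_f3, points)
err_f4 = max_error(f4, d_f4, points)
```

Both `d_f3` and `d_f4` tend to 1 as x approaches 0 from below, so the slopes of f₃ and f₄ are continuous at the origin, whereas the slope of f₁ jumps from −1 to 1.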

Method	CIFAR10 ACC	CIFAR10 T/h	CIFAR100 ACC	CIFAR100 T/h
f₁	93.11%	1.332	74.82%	1.332
f₂	93.03%	1.335	74.27%	1.335
f₃	93.66%	1.290	75.23%	1.290
f₄	93.78%	1.262	75.87%	1.262
ReLU	92.90%	1.325	73.68%	1.325

Table 3. Performance of different activation functions on the ResNet18 network

Method	CIFAR10 ACC	CIFAR10 T/h	CIFAR100 ACC	CIFAR100 T/h
f₁	91.31%	1.225	58.91%	1.225
f₂	91.24%	1.248	58.35%	1.248
f₃	91.86%	1.243	59.23%	1.243
f₄	91.98%	1.175	59.95%	1.175
ReLU	91.15%	1.238	56.24%	1.238

Table 4. Performance of different activation functions on the VGG16 network
