Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), Clevert, Unterthiner, Hochreiter; 2015 - Summary
author: joshpapermaster
score: 8 / 10

Summary 1: FAST AND ACCURATE DEEP NETWORK LEARNING BY EXPONENTIAL LINEAR UNITS (ELUS)

This paper introduces the exponential linear unit (ELU), an activation function that speeds up learning compared to the previously most common activation functions such as ReLU, leaky ReLU (LReLU), and parametric ReLU (PReLU).

ELU keeps the desirable properties of ReLU by using the identity, f(x) = x, for x >= 0.

ELU builds on the improvements of LReLU and PReLU by handling negative inputs with an exponential: f(x) = alpha * (exp(x) - 1) for x <= 0, which saturates to -alpha for very negative inputs instead of growing linearly.
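
A minimal NumPy sketch of the activations being compared (function names are illustrative, not the authors' code; alpha defaults to 1.0 as in the paper's experiments):

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) for x <= 0."""
    return np.where(x > 0, x, alpha * np.expm1(x))

def elu_grad(x, alpha=1.0):
    """ELU derivative: 1 for x > 0, alpha * exp(x) (= elu(x) + alpha) for x <= 0."""
    return np.where(x > 0, 1.0, alpha * np.exp(x))

def relu(x):
    """ReLU for comparison: zero for negative inputs."""
    return np.maximum(x, 0.0)

def lrelu(x, slope=0.01):
    """Leaky ReLU for comparison: small linear slope for negative inputs."""
    return np.where(x > 0, x, slope * x)
```

Unlike LReLU and PReLU, the negative branch of ELU flattens out at -alpha, so very negative inputs all map to roughly the same value while moderately negative inputs still receive a nonzero gradient.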

[Figure: the ELU activation function]

ELU performs well on the major image recognition benchmarks the authors tested on. They showed that ELU produces significantly lower mean activations and lower training loss than the other activation functions on MNIST, so it converges comparatively faster during training. However, because of the exponential in the negative branch, ELU is somewhat slower to compute than ReLU and its variants at test time. At the time of publication, ELU held the best published test error on CIFAR-100 and the second best on CIFAR-10. Their CIFAR-100 test error of 24.28% is well below the previous best of 27.62%.
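
The lower-mean-activation effect can be illustrated with a quick check on zero-mean random pre-activations (a toy illustration, not the paper's experiment): ReLU discards the negative half and biases unit outputs positive, while ELU's negative branch pulls the mean back toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)  # zero-mean pre-activations

relu_mean = np.maximum(z, 0.0).mean()               # about 0.40: biased positive
elu_mean = np.where(z > 0, z, np.expm1(z)).mean()   # about 0.16: closer to zero
print(f"ReLU mean: {relu_mean:.2f}, ELU mean: {elu_mean:.2f}")
```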

Their evaluations showed that ELU adapts well to many different image-recognition CNN architectures. ELU without batch normalization performed better than ReLU with batch normalization, and batch normalization did not improve ELU's performance in their tests.

TL;DR

ELU behaves like ReLU for positive inputs but smoothly saturates to -alpha for negative inputs, which pushes mean activations toward zero, speeds up convergence, and yielded a state-of-the-art CIFAR-100 result without batch normalization.