What are the PReLU and ELU activation functions?
PReLU (Parametric ReLU) –
ReLU is vital to the success of deep learning: it solves the vanishing-gradient problem that plagues saturating activation functions like the sigmoid, and it has become the default choice in deep learning environments. But we can still improve upon ReLU. Leaky ReLU was introduced, which does not zero out negative inputs as ReLU does. Instead, it multiplies negative inputs by a small constant (such as 0.01) and keeps positive inputs as they are. In practice this has shown only a negligible increase in the accuracy of our models.
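To make the difference from plain ReLU concrete, here is a minimal NumPy sketch of Leaky ReLU (the function name and the 0.01 slope are illustrative choices, not a fixed standard):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Positive inputs pass through unchanged; negative inputs are
    # scaled by a small fixed slope instead of being zeroed out.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(leaky_relu(x))
```

Unlike ReLU, the negative inputs still produce a small non-zero output, so their gradients do not vanish entirely.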
Another disadvantage of ReLU (and Leaky ReLU) is that the slope on the negative side is fixed in advance, so it cannot adapt to the data. This is where PReLU comes in – it treats that slope as a parameter and learns it using backpropagation, avoiding this problem.
Feed-forward networks only need to learn one slope parameter per layer. In Convolutional Neural Networks, the slope parameters can be learned either per layer or per channel. Compared to the number of weights and biases that need to be learned, these extra parameters are relatively insignificant.
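A minimal sketch of how PReLU's slope becomes learnable: the forward pass is the same as Leaky ReLU, but we also need the gradient of the loss with respect to alpha so backpropagation can update it (the function names here are illustrative):

```python
import numpy as np

def prelu(x, alpha):
    # alpha is a learned parameter: a single scalar per layer,
    # or one value per channel in a CNN.
    return np.where(x > 0, x, alpha * x)

def prelu_grad_alpha(x, grad_out, alpha):
    # d(prelu)/d(alpha) is x where x <= 0 and 0 where x > 0;
    # backpropagation accumulates this against the upstream gradient.
    return np.sum(grad_out * np.where(x > 0, 0.0, x))

x = np.array([-2.0, 1.0])
print(prelu(x, 0.25))
print(prelu_grad_alpha(x, np.ones_like(x), 0.25))
```

Because alpha contributes only one extra number per layer (or per channel), the cost of learning it is negligible next to the weights and biases.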
ELU (Exponential Linear Unit) –
ELU activation functions are more computationally expensive than PReLU activation functions, because they evaluate an exponential for negative inputs. In return, rather than a straight negative slope, they saturate smoothly to a fixed negative value, which can be advantageous for certain types of networks.
Exponential Linear Units are used to push the mean activation of each layer closer to zero. The alpha constant is an important parameter that must be a positive number; it sets the value the function saturates to for large negative inputs.
ELU has been shown to provide more accurate results than ReLU and also to converge faster. ELU and ReLU are identical for positive input values, but for negative input values ELU smoothly "eases" down toward -alpha, whereas ReLU is exactly zero everywhere on the negative side.