What is Vanishing and exploding gradient descent?
Vanishing and exploding gradient descent is a type of optimization algorithm used in deep learning.
Vanishing Gradient occurs when the gradient is smaller than expected. It causes the earlier layers to start degrading before the later ones do, causing a decrease in the overall learning rate of that subset of layers. The weights of the previous nodes will update until they’re no longer significant.
The vanishing gradient is a problem in neural networks that occurs because the values for the weights of the network and the derivatives of these weights are very small, which leads to a steep learning rate.
Exploding gradient can happen when a gradient is too big and creates an unstable model. This means the weight of the model will be large and they may appear as “NaN” (see if you don’t believe me). This can be solved using the dimensionality reduction technique which helps to minimize complexity.