The tanh activation function is the hyperbolic tangent function, a mathematical function frequently encountered in trigonometry and calculus; it can also be written as a scaled and shifted version of the sigmoid function. The tanh function squashes input values into the range of -1 to 1, making it a useful choice of activation function in neural networks. Defining the Tanh Function Mathematically The mathematical…
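As a rough illustration of the squashing behaviour described in the tanh excerpt above, the following sketch computes tanh from its exponential definition and compares it with the standard library. The helper names `tanh_from_exp` and `sigmoid` and the sample inputs are chosen here for illustration and do not come from the original post.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, used only to check the identity tanh(x) = 2*sigmoid(2x) - 1."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_from_exp(x: float) -> float:
    """tanh written out as (e^x - e^-x) / (e^x + e^-x)."""
    e_pos, e_neg = math.exp(x), math.exp(-x)
    return (e_pos - e_neg) / (e_pos + e_neg)


# Illustrative inputs: large negative, moderate, zero, and large positive values.
samples = [-5.0, -1.0, 0.0, 1.0, 5.0]
outputs = [tanh_from_exp(x) for x in samples]

for x, y in zip(samples, outputs):
    print(f"tanh({x:+.1f}) = {y:+.6f}")
```

Every printed value lies strictly between -1 and 1, and the output is zero-centered. In practice `math.tanh` (or a framework's built-in) is the numerically safer choice, since the hand-written form overflows for large inputs.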
PReLU (Parametric ReLU) – ReLU is vital to the success of deep learning. It mitigates the problem with activation functions like sigmoid, where gradients would often vanish. This approach is finding more and more success in deep learning environments. But we can still improve upon ReLU. Leaky ReLU was introduced, which does not zero out the…
What is an Activation Function? An activation function is a critical component in neural networks. It determines a neuron’s output after the neuron processes its inputs by computing a weighted sum. The activation function decides whether the neuron should be activated or not, introducing nonlinearity to the model. This nonlinearity enables the model to learn…
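The description above (weighted sum first, activation second) can be sketched for a single neuron as follows. The function `neuron_output`, the `relu` helper, and the example weights and bias are hypothetical values picked for illustration, not taken from the original post.

```python
import math


def relu(z: float) -> float:
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, z)


def neuron_output(inputs, weights, bias, activation):
    """Compute the weighted sum of the inputs, then apply the activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical example values.
inputs = [1.0, 2.0]
weights = [0.5, -1.0]
bias = 0.25

print(neuron_output(inputs, weights, bias, relu))       # weighted sum is negative, so ReLU gives 0.0
print(neuron_output(inputs, weights, bias, math.tanh))  # tanh keeps the sign and squashes the value
```

Swapping the `activation` argument is all it takes to change how the same weighted sum is turned into an output, which is why the choice of activation function is treated as a separate design decision.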
An activation function is a nonlinear function that takes in the weighted sum of a neuron’s inputs and produces the neuron’s output. Activation functions provide a simplified model of neuron behavior, and their outputs serve as inputs to the next layer of a deep neural network. There are many different activation functions that can be used, including sigmoid, hyperbolic tangent, logistic,…
Vanishing and exploding gradients are problems that arise when training deep neural networks with gradient-based optimization. Vanishing Gradient The vanishing gradient problem occurs when gradients shrink as they are propagated backward through the layers. The earlier layers then receive much smaller updates than the later ones, so that subset of layers learns very slowly. The weights…
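A toy calculation can make the vanishing and exploding behaviour concrete. The sketch below assumes a chain of single-unit sigmoid layers that all share the same scalar weight and the same pre-activation value. Real networks are not this uniform, and the function name `backprop_factor` is invented here for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z: float) -> float:
    """Derivative of the sigmoid; its maximum value is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def backprop_factor(depth: int, weight: float = 1.0, z: float = 0.0) -> float:
    """Factor by which a gradient is scaled after passing back through `depth` layers.

    Simplifying assumption: every layer has the same scalar weight and pre-activation z,
    so each layer multiplies the gradient by weight * sigmoid'(z).
    """
    factor = 1.0
    for _ in range(depth):
        factor *= weight * sigmoid_grad(z)
    return factor


for depth in (1, 5, 10):
    print(depth, backprop_factor(depth), backprop_factor(depth, weight=8.0))
```

With a weight of 1 the factor shrinks geometrically with depth (vanishing), while a large weight such as 8 makes each layer's multiplier exceed 1, so the factor grows geometrically instead (exploding).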
Forward propagation is the process in which an input is passed through the network, layer by layer, to compute the output. The weights are updated afterwards, during backpropagation, according to the gradient of the loss. In order to update the weights, we need to find the input and output values. The input value is found by taking the difference between the current hidden-state value and that of…
Perceptrons are a type of artificial neural network that can be used for classification and regression. They are supervised learning algorithms, meaning they need labeled input data in order to learn how to map inputs to outputs. What independent variables do perceptrons need? Perceptrons require at least one input and one output. What are the…
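To show what learning from labeled data can look like, here is a minimal sketch of the classic perceptron learning rule trained on the logical AND function. The dataset, learning rate, epoch count, and function names are assumptions made for this example and are not drawn from the original post.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum plus bias is positive, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Perceptron rule: nudge the weights by lr * (label - prediction) * input."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Labeled training data for logical AND (linearly separable, so the rule converges).
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]

weights, bias = train_perceptron(AND_INPUTS, AND_LABELS)
print([predict(weights, bias, x) for x in AND_INPUTS])
```

A single perceptron of this kind can only separate classes with a straight line (or hyperplane), so a function such as XOR would not be learned by this sketch.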
The perceptron is a type of artificial neural network (ANN) that is designed to recognize patterns in data. It can be used to identify objects, classify images, and detect changes in the environment. The perceptron was invented by Frank Rosenblatt in 1957 while he was working at Cornell Aeronautical Laboratory as part of a research…
In deep learning, L1 and L2 regularization are regularization techniques used to penalize the model’s weights during the training process. This penalty discourages the model from assigning excessive importance to certain features, thereby reducing the risk of overfitting. L1 Regularization L1 regularization, also known as Lasso regularization, adds a penalty proportional to the absolute value…
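The two penalties can be sketched as plain functions of the weight vector. The regularization strength `lam`, the example weights, and the function names below are illustrative assumptions. Some formulations also scale the L2 term by one half, which this sketch does not.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso-style) penalty: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 (Ridge-style) penalty: lam times the sum of squared weight values."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam, penalty):
    """Total training loss = data loss + chosen penalty on the weights."""
    return data_loss + penalty(weights, lam)


# Hypothetical weights: one small, one large, one already zero.
example_weights = [0.5, -2.0, 0.0]
lam = 0.1

print(l1_penalty(example_weights, lam))
print(l2_penalty(example_weights, lam))
print(regularized_loss(1.0, example_weights, lam, l2_penalty))
```

Because the L2 term squares each weight, the large weight of -2.0 dominates that penalty, whereas the L1 term grows only linearly with it. This difference is one reason L1 tends to push small weights to exactly zero while L2 mainly shrinks large ones.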
Introduction to LSTM (Long Short-Term Memory) Imagine you’re at a murder mystery dinner. At the very beginning, the Lord of the Manor suddenly collapses, and your task is to figure out who done it. It could be the maid or the butler. However, there’s a problem: your short-term memory is not working. You can’t recall any…
Artificial intelligence (AI) is a term that encompasses computer systems designed to imitate human intelligence. It is an exciting field that has attracted considerable attention in many industries, including finance, hospitality, education and entertainment. Artificial intelligence is designed to simulate human behavior and thought processes, making it one of the most important trends of this…
Semi-supervised learning is a technique that sits between supervised and unsupervised learning. Arguably, it should not be a category of machine learning but only a generalization of supervised learning, but it’s useful to introduce the concept separately. Its aim is to reduce the cost of gathering labeled data by extending a few labels to similar unlabeled…