Understanding the Perceptron Neural Network
What is a Perceptron?
A perceptron is a simple type of artificial neuron in a neural network. It takes in several inputs (denoted as x1, x2, …, xn) and produces a single binary output, essentially making a yes/no decision.

The perceptron multiplies each input by a corresponding weight (w1, w2, …, wn), sums the results, and adds a bias term (b). Depending on whether this sum exceeds zero, the perceptron outputs either a 0 or a 1.
Mathematical Representation
The perceptron can be expressed as follows:
- Inputs: x = [x1, x2, …, xn]
- Weights: w = [w1, w2, …, wn]
- Bias: b
The perceptron computes z = w ⋅ x + b, and applies the following rule:
- If z ≤ 0, output = 0
- If z > 0, output = 1
This simple algorithm acts as a linear classifier.
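This rule is nothing more than a dot product followed by a comparison. Here is a minimal sketch in NumPy; the input, weight, and bias values are arbitrary, chosen purely for illustration:

import numpy as np

x = np.array([1.0, 0.0, 1.0])    # example inputs (illustrative values)
w = np.array([0.5, -0.25, 0.75]) # example weights (illustrative values)
b = -1.0                         # example bias (illustrative value)

z = np.dot(w, x) + b             # weighted sum plus bias
output = 1 if z > 0 else 0       # step rule: 1 when z > 0, else 0
print(z, output)                 # 0.25 1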
The Activation Function
The perceptron uses a step function (also called the Heaviside step function) as its activation function. This function is defined as:
- f(z) = 0 if z ≤ 0
- f(z) = 1 if z > 0
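In code, this step function can be written in one line with NumPy (a small sketch, separate from the full class implemented later in this article):

import numpy as np

def step(z):
    # Heaviside step: 1 for z > 0, 0 for z <= 0
    return np.where(z > 0, 1, 0)

print(step(np.array([-1.5, 0.0, 2.0])))  # [0 0 1]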
A Practical Example: Going to the Movies
Let’s consider a decision-making scenario: whether to go to the movies based on three factors:
- Weather
- Whether you have company
- Proximity to the theater
Assign weights to these factors:
- Weather: w1 = 4 (most important)
- Company: w2 = 2
- Proximity: w3 = 2
Set the bias to b = −5. The rule is to go to the movies only if the weather is good and at least one other factor is favorable.
Case 1: Bad weather (x1 = 0), company (x2 = 1), and close proximity (x3 = 1):
z = (4 ⋅ 0) + (2 ⋅ 1) + (2 ⋅ 1) − 5 = 4 − 5 = −1
Since z ≤ 0, output = 0 (Do not go to the movies).
Case 2: Good weather (x1 = 1), company (x2 = 1), and close proximity (x3 = 1):
z = (4 ⋅ 1) + (2 ⋅ 1) + (2 ⋅ 1) − 5 = 8 − 5 = 3
Since z > 0, output = 1 (Go to the movies).
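Both cases can be reproduced directly with the weights and bias from this section. Here is a small sketch in plain Python:

weights = [4, 2, 2]  # weather, company, proximity
bias = -5

def go_to_movies(x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0

print(go_to_movies([0, 1, 1]))  # Case 1: z = -1 -> 0 (do not go)
print(go_to_movies([1, 1, 1]))  # Case 2: z = 3  -> 1 (go)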
Geometric Perspective
Consider a perceptron with two inputs (x1 and x2), weights (w1 = −2, w2 = −2), and bias (b = 3). The output is determined by:
z = −2x1 − 2x2 + 3
In the input space (x1, x2), the decision boundary is a straight line:
z = 0 ⟹ −2x1 − 2x2 + 3 = 0
Points on or above the line (where x1 + x2 ≥ 1.5) produce z ≤ 0 (output = 0), while points below it yield z > 0 (output = 1). Thus, the perceptron behaves as a linear classifier.
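To see the boundary numerically, evaluate z at points on either side of the line x1 + x2 = 1.5 (a quick sketch; the sample points are arbitrary):

w1, w2, b = -2, -2, 3

for x1, x2 in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (2.0, 2.0)]:
    z = w1 * x1 + w2 * x2 + b
    print((x1, x2), z, 1 if z > 0 else 0)
# (0.0, 0.0) 3.0 1   -- below the line, output 1
# (0.5, 0.5) 1.0 1   -- below the line, output 1
# (1.0, 1.0) -1.0 0  -- above the line, output 0
# (2.0, 2.0) -5.0 0  -- above the line, output 0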
Binary Inputs
For binary inputs (xi ∈ {0,1}), there are four possible combinations:
- x1 = 0, x2 = 0
- x1 = 1, x2 = 0
- x1 = 0, x2 = 1
- x1 = 1, x2 = 1
The perceptron computes the output for each combination based on the weights and bias.
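A direct enumeration makes this concrete. This small sketch reuses the weights w1 = w2 = −2 and bias b = 3 from the geometric example; notably, this particular choice of parameters reproduces the NAND gate:

for x1 in (0, 1):
    for x2 in (0, 1):
        z = -2 * x1 - 2 * x2 + 3
        print(x1, x2, 1 if z > 0 else 0)
# 0 0 -> 1
# 0 1 -> 1
# 1 0 -> 1
# 1 1 -> 0  (matches the NAND truth table)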
Let’s implement the perceptron in Python from scratch:
import numpy as np

class Perceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def activation_function(self, x):
        """
        Unit step function: returns 1 if x > 0, else 0
        (matching the step rule defined above).
        """
        return np.where(x > 0, 1, 0)

    def fit(self, X, y):
        """
        Train the perceptron using the perceptron update rule.

        Parameters:
        X : np.array
            Training data of shape (n_samples, n_features).
        y : np.array
            Target labels of shape (n_samples,).
        """
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = self.activation_function(linear_output)
                # Perceptron update rule: nudge weights toward misclassified
                # examples; the update is zero when the prediction is correct.
                update = self.lr * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        """
        Predict the class labels for the input data X.

        Parameters:
        X : np.array
            Input data of shape (n_samples, n_features).

        Returns:
        np.array
            Predicted class labels of shape (n_samples,).
        """
        linear_output = np.dot(X, self.weights) + self.bias
        return self.activation_function(linear_output)

# Example usage
def main():
    # Example dataset (linearly separable)
    X = np.array([[1, 1], [2, 2], [3, 3], [1.5, 0.5], [3, 1], [4, 1]])
    y = np.array([0, 0, 0, 1, 1, 1])  # Binary labels

    # Initialize and train the perceptron
    perceptron = Perceptron(learning_rate=0.1, n_iters=1000)
    perceptron.fit(X, y)

    # Test predictions on the training data
    predictions = perceptron.predict(X)
    print("Predictions:", predictions)

if __name__ == "__main__":
    main()
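Because the example dataset is linearly separable, the perceptron convergence theorem guarantees that the training loop above finds a separating boundary after finitely many updates; the printed predictions should then match the training labels, [0 0 0 1 1 1].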
Conclusion
The perceptron is a foundational concept in neural networks, demonstrating the principles of linear classification through a simple algorithm. While limited in handling non-linear problems, it lays the groundwork for more advanced neural network architectures and deep learning models. Understanding the perceptron is an essential step in grasping the fundamentals of machine learning.
Author
Naveen Pandey has more than 2 years of experience in data science and machine learning. He is an experienced Machine Learning Engineer with a strong background in data analysis and natural language processing. Holding a Bachelor of Science in Information Technology from Sikkim Manipal University, he excels in leveraging cutting-edge technologies such as Large Language Models (LLMs), TensorFlow, PyTorch, and Hugging Face to develop innovative solutions.
