Introduction

Neural networks are a type of machine learning model loosely inspired by the way neurons in the human brain process information. They are used for a wide range of applications, from image and speech recognition to natural language processing and predictive analytics.

In this article, I will show you how to write a simple neural network, known as a perceptron, in Python using the NumPy library. This neural network will consist of an input layer and an output layer, with a sigmoid activation function. The perceptron is the conceptual building block of modern AI systems like ChatGPT, DALL-E, and Copilot.

Step 1: Import NumPy

First, we need to import the NumPy library, which provides support for large, multi-dimensional arrays and matrices, along with a wide range of mathematical functions. This is the only library we need.

import numpy as np

Step 2: Define the sigmoid activation function

[Figure: A visual representation of a sigmoid function]

The sigmoid() function is a mathematical function that maps any input value to a value between 0 and 1, which is useful for modeling the behavior of neurons in a neural network.

An activation function is a crucial component of both artificial and biological neural networks. It allows neurons to do more than simply pass along the input they receive; it plays a role similar to the rate at which action potentials fire in the brain.

# Sigmoid activation function.
# When deriv=True, x is assumed to already be a sigmoid output,
# so x * (1 - x) gives the slope of the curve at that point.
def sigmoid(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
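To see the squashing behavior in action, we can evaluate the function at a few points. This quick sanity check is added for illustration and is not part of the network itself:

print(sigmoid(np.array([-6.0, 0.0, 6.0])))
# approximately [0.0025 0.5 0.9975]

Large negative inputs are pushed toward 0, large positive inputs toward 1, and an input of 0 maps exactly to 0.5.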

Step 3: Define the neural network architecture

[Figure: Diagram of a single-layer perceptron]

Our neural network will consist of an input layer with three nodes and an output layer with one node. This is known as a single-layer perceptron.

The specific size and number of layers in a neural network depend on factors such as problem complexity, dataset size, and available resources. A general rule of thumb is to match the input and output layers to the data. Hidden layers between the two will be covered in the next article.

Ultimately, the best architecture should be determined through experimentation to find the one that performs best for the specific problem.

input_layer = 3
output_layer = 1


# Input matrix, 4 entries each with 3 inputs
X = np.array([  [1,0,0],
                [0,0,1],
                [0,1,0],
                [1,0,1] ])
    
# Output set, 1 output per input entry            
y = np.array([  [0],
                [1],
                [0],
                [1]])
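It is worth noting that in this toy dataset the target output is simply a copy of the third input column, exactly the kind of linearly separable pattern a single-layer perceptron can learn. A quick check, added here for illustration:

print(np.array_equal(y, X[:, [2]]))
# True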

Step 4: Initialize the weights

The weights are the parameters that the neural network will learn during training. We will seed NumPy's random number generator with np.random.seed(1) so the results are reproducible, then initialize the weights randomly using the np.random.random() function.

np.random.seed(1)

# Initialize the array of weights randomly
W0 = np.random.random((input_layer,output_layer))
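With the seed fixed, W0 is a 3x1 column of values drawn uniformly from [0, 1), one weight per input node. A quick shape check, which is my addition and not required by the network:

print(W0.shape)
# (3, 1)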

Step 5: Define the forward propagation function

The forward_propagate() function computes the output of the perceptron for a given input. It does this by multiplying the input by the weights of the input layer and applying the sigmoid activation function.

def forward_propagate(X):
    L0 = X
    L1 = sigmoid(np.dot(L0,W0))
    return L1
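At this point the weights are still random, so any prediction is essentially arbitrary. A quick pre-training check, added here for illustration (the exact value depends on the random seed):

print(forward_propagate(np.array([[1,0,1]])))
# some arbitrary value strictly between 0 and 1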

Step 6: Train the perceptron

The train() function passes the input through the forward propagation function. Then it performs a series of calculations known as backpropagation, which is used to update the weights of the network based on the error between the predicted output and the target output. Backpropagation consists of three steps:

1. The layer one error is calculated by subtracting the predicted output (L1) from the target output (y).
2. The layer one delta, the gradient of the error with respect to the weights, is computed by scaling the error by the slope of the sigmoid at L1.
3. The delta is used to update the weights of the network in the direction of the gradient.

This process is repeated for a specified number of epochs, or until the error is minimized.
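In the code below, the update step is W0 += np.dot(L0.T, L1_delta). A quick dimension check makes this concrete: L0.T has shape (3, 4) and L1_delta has shape (4, 1), so their product has shape (3, 1), matching W0. The following optional check is my addition, not part of the training loop:

# Optional shape check (assumes X, y, W0 and sigmoid are defined above)
L1 = sigmoid(np.dot(X, W0))
L1_delta = (y - L1) * sigmoid(L1, True)
print(X.T.shape, L1_delta.shape, np.dot(X.T, L1_delta).shape)
# (3, 4) (4, 1) (3, 1)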

def train():
    global W0
    for epoch in range(10000):

        # Forward propagation
        L0 = X
        L1 = sigmoid(np.dot(L0,W0))

        # Calculate the difference between the predicted output (L1)
        #  and target output (y)
        L1_error = y - L1

        # Multiply how much we missed by the slope of the sigmoid
        #  at the values in L1 (L1 is already a sigmoid output, so
        #  sigmoid(L1,True) returns the derivative directly)
        L1_delta = L1_error * sigmoid(L1,True)

        # Update weights using the delta
        W0 += np.dot(L0.T,L1_delta)

    print("Output After Training:")
    print(L1)
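The update in train() applies the full gradient step, which amounts to an implicit learning rate of 1 and is sufficient for this tiny dataset. On larger problems it is common to scale the step down. Below is a minimal sketch of the same loop with a hypothetical learning_rate parameter; this variant is my addition and not part of the original code:

# Hypothetical variant of train() with an explicit step size
def train_with_lr(learning_rate=0.1, epochs=10000):
    global W0
    for epoch in range(epochs):
        L1 = sigmoid(np.dot(X, W0))               # forward pass
        L1_delta = (y - L1) * sigmoid(L1, True)   # error scaled by slope
        W0 += learning_rate * np.dot(X.T, L1_delta)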

Step 7: Test the perceptron

To test our perceptron, we first execute the train() function; then we can pass a sample input through the forward_propagate() function and print the output.

train()

result = forward_propagate(np.array([[1,0,1]]))

print(np.round(result))

The result variable will be a value between 0 and 1, which represents the perceptron's prediction for the output. For clarity we have chosen to round the output to either 0 or 1 in the print() statement.

This test results in the output [[1.]], which is the correct output as per the training data.
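We can also run the entire training set back through the network. If training converged, the rounded predictions should reproduce y exactly; this extra check is my addition:

print(np.round(forward_propagate(X)))   # should match y: 0, 1, 0, 1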

The Code

import numpy as np

input_layer = 3
output_layer = 1

# Input matrix, 4 entries each with 3 inputs
X = np.array([  [1,0,0],
                [0,0,1],
                [0,1,0],
                [1,0,1] ])
    
# Output set, 1 output per input entry            
y = np.array([  [0],
                [1],
                [0],
                [1]])

# Sigmoid activation function.
# When deriv=True, x is assumed to already be a sigmoid output,
# so x * (1 - x) gives the slope of the curve at that point.
def sigmoid(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

np.random.seed(1)

# Initialize the array of weights randomly
W0 = np.random.random((input_layer,output_layer))

# Define our forward propagation function
def forward_propagate(X):
    L0 = X
    L1 = sigmoid(np.dot(L0,W0))
    return L1

def train():
    global W0
    for epoch in range(10000):

        # Forward propagation
        L0 = X
        L1 = sigmoid(np.dot(L0,W0))

        # Calculate the difference between the predicted output (L1)
        #  and target output (y)
        L1_error = y - L1

        # Multiply how much we missed by the slope of the sigmoid
        #  at the values in L1
        L1_delta = L1_error * sigmoid(L1,True)

        # Update weights using the delta
        W0 += np.dot(L0.T,L1_delta)

    print("Output After Training:")
    print(L1)

train()    

result = np.round(forward_propagate(np.array([[1,0,1]])))

print(result)

Components

X – Input data matrix.
y – Expected output matrix.
W0 – Weights connecting layer 0 and layer 1.
L0 – Layer 0, the input layer.
L1 – Layer 1, the output layer.
L1_error – Layer 1 error, the result of subtracting the predicted output from the expected output.
L1_delta – Layer 1 delta, the L1 error scaled by the slope of the sigmoid (the network's confidence).

Conclusion

In this article, we showed you how to write a simple perceptron in Python using NumPy. We started by importing the NumPy library, defining the sigmoid activation function, and defining the architecture of our neural network to be that of a single-layer perceptron. We then initialized the weights randomly, defined the forward propagation function, and trained the network using backpropagation. Finally, we tested our perceptron by passing a sample input through the forward_propagate() function and printing the output.

This simple neural network can be used as a starting point for more complex models, with additional layers and more nodes, to solve a wide range of machine learning problems.

A Python notebook containing more detail on this and other neural network architectures is available here, and the interactive version is linked here.