Introduction
Neural networks are a class of machine learning models loosely inspired by the behavior of the human brain. They are used for a wide range of applications, from image and speech recognition to natural language processing and predictive analytics.
In this article, I will show you how to write a simple neural network, known as a perceptron, in Python using the NumPy library. This neural network will consist of an input layer and an output layer, with a sigmoid activation function. The perceptron is the conceptual ancestor of modern AI systems like ChatGPT, DALL·E, and Copilot.
Step 1: Import NumPy
First, we need to import the NumPy library, which provides support for large, multi-dimensional arrays and matrices, along with a wide range of mathematical functions. This is the only library we need.
import numpy as np
Step 2: Define the sigmoid activation function
The sigmoid() function maps any input value to a value between 0 and 1, which is useful for modeling the firing behavior of neurons in a neural network.
An activation function is a crucial component of both artificial and biological neural networks. It allows neurons to do more than simply pass along the input they receive: the activation function plays a role analogous to the rate at which action potentials fire in the brain.
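For reference, the sigmoid and its derivative are:

σ(x) = 1 / (1 + e^(−x))
σ′(x) = σ(x) · (1 − σ(x))

The second identity explains the deriv branch in the code below: it returns x * (1 - x), which assumes x is already the output of a sigmoid.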
# Sigmoid activation function
def sigmoid(x, deriv=False):
    if deriv:
        # Assumes x is already a sigmoid output, so x * (1 - x)
        # equals the derivative of the sigmoid at the original input
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
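As a quick sanity check (an illustrative snippet, not part of the original walkthrough), you can evaluate the function on a few values and confirm the outputs fall between 0 and 1:

# Outputs approach 0 for large negative inputs and 1 for large positive inputs
print(sigmoid(np.array([-2.0, 0.0, 2.0])))  # approximately [0.119, 0.5, 0.881]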
Step 3: Define the neural network architecture
Our neural network will consist of an input layer with three nodes and an output layer with one node. This is known as a single-layer perceptron.
The specific size and number of layers in a neural network depend on factors such as problem complexity, dataset size, and available resources. A general rule of thumb is to match the size of the input layer to the number of features in your data and the size of the output layer to the number of targets. Hidden layers between the two will be covered in the next article.
Ultimately, the best architecture should be determined through experimentation to find the one that performs best for the specific problem.
input_layer = 3
output_layer = 1

# Input matrix: 4 entries, each with 3 inputs
X = np.array([[1, 0, 0],
              [0, 0, 1],
              [0, 1, 0],
              [1, 0, 1]])

# Output set: 1 output per input entry
y = np.array([[0],
              [1],
              [0],
              [1]])
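Notice that in this training set the target is simply the value of each entry's third input. Since that relationship is linearly separable, a single-layer perceptron can learn it. A quick shape check (illustrative, not in the original code) confirms the matrices line up:

print(X.shape)  # (4, 3): 4 examples, 3 inputs each
print(y.shape)  # (4, 1): 1 target per example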
Step 4: Initialize the weights
The weights are the parameters that the neural network will learn during training. We will seed NumPy's random number generator with np.random.seed(1) so that the run is reproducible, then initialize the weights randomly using the np.random.random() function.
np.random.seed(1)
# Initialize the array of weights randomly
W0 = np.random.random((input_layer, output_layer))
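The shape (input_layer, output_layer) gives one weight for each connection from an input node to the output node. A quick check (illustrative, not part of the original):

print(W0.shape)  # (3, 1): one weight per input node
print(W0)        # the same values every run, thanks to the fixed seed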
Step 5: Define the forward propagation function
The forward_propagate() function computes the output of the perceptron for a given input. It does this by multiplying the input by the weights of the input layer and applying the sigmoid activation function.
def forward_propagate(X):
    L0 = X
    L1 = sigmoid(np.dot(L0, W0))
    return L1
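Before training, the predictions are essentially arbitrary. Running the untrained network over the whole input matrix (a quick demonstration, not in the original) shows outputs that do not yet match y:

# With these all-positive initial weights, every output lands at or above 0.5
print(forward_propagate(X))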
Step 6: Train the perceptron
The train() function passes the input through the forward propagation function. It then performs a series of calculations known as backpropagation, which updates the weights of the network based on the error between the predicted output and the target output. Backpropagation consists of three steps:
1. The layer-one error is calculated by subtracting the predicted output (L1) from the target output (y).
2. The layer-one delta is computed by scaling that error by the slope of the sigmoid at L1.
3. The deltas are used to update the weights of the network in the direction of the gradient, reducing the error.
This process is repeated for a specified number of epochs, or until the error is minimized.
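In matrix form, using the variable names from the code below, one training pass applies:

L1_error = y − L1
L1_delta = L1_error · σ′(L1)
W0 ← W0 + L0ᵀ · L1_delta

Because σ′ is largest when an output sits near 0.5 and smallest near 0 or 1, confident predictions receive small corrections while uncertain ones are adjusted more aggressively.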
def train():
    global W0
    for epoch in range(10000):
        # Forward propagation
        L0 = X
        L1 = sigmoid(np.dot(L0, W0))

        # Calculate the difference between the predicted output (L1)
        # and the target output (y)
        L1_error = y - L1

        # Multiply how much we missed by the slope of the sigmoid
        # at the values in L1
        L1_delta = L1_error * sigmoid(L1, True)

        # Update the weights using the delta
        W0 += np.dot(L0.T, L1_delta)

    print("Output After Training:")
    print(L1)
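If you want to watch the error shrink as training progresses, a lightly modified loop (a sketch, not part of the original code; train_verbose is a hypothetical name) can report the mean absolute error every thousand epochs:

def train_verbose(epochs=10000, report_every=1000):
    global W0
    for epoch in range(epochs):
        L1 = sigmoid(np.dot(X, W0))
        L1_error = y - L1
        if epoch % report_every == 0:
            print("epoch", epoch, "mean abs error:", np.mean(np.abs(L1_error)))
        # Same update rule as train()
        W0 += np.dot(X.T, L1_error * sigmoid(L1, True))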
Step 7: Test the perceptron
To test our perceptron, we first call the train() function. Then we can pass a sample input through the forward_propagate() function and print the output.
train()
result = forward_propagate(np.array([[1, 0, 1]]))
print(np.round(result))
The result variable holds a value between 0 and 1, which represents the perceptron's prediction for the output. For clarity, we round the output to either 0 or 1 in the print() statement.
This test results in the output [[1.]], which is the correct output as per the training data.
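You can also run every training example back through the network at once (an extra check, not in the original code); after 10,000 epochs the rounded predictions should match y exactly:

print(np.round(forward_propagate(X)))  # expect [[0.], [1.], [0.], [1.]]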
The Code
import numpy as np

input_layer = 3
output_layer = 1

# Input matrix: 4 entries, each with 3 inputs
X = np.array([[1, 0, 0],
              [0, 0, 1],
              [0, 1, 0],
              [1, 0, 1]])

# Output set: 1 output per input entry
y = np.array([[0],
              [1],
              [0],
              [1]])

# Sigmoid activation function
def sigmoid(x, deriv=False):
    if deriv:
        # Assumes x is already a sigmoid output
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

np.random.seed(1)

# Initialize the array of weights randomly
W0 = np.random.random((input_layer, output_layer))

# Define our forward propagation function
def forward_propagate(X):
    L0 = X
    L1 = sigmoid(np.dot(L0, W0))
    return L1

def train():
    global W0
    for epoch in range(10000):
        # Forward propagation
        L0 = X
        L1 = sigmoid(np.dot(L0, W0))

        # Calculate the difference between the predicted output (L1)
        # and the target output (y)
        L1_error = y - L1

        # Multiply how much we missed by the slope of the sigmoid
        # at the values in L1
        L1_delta = L1_error * sigmoid(L1, True)

        # Update the weights using the delta
        W0 += np.dot(L0.T, L1_delta)

    print("Output After Training:")
    print(L1)

train()
result = np.round(forward_propagate(np.array([[1, 0, 1]])))
print(result)
Components
| Component | Description |
| --- | --- |
| X | Input data matrix. |
| y | Expected output matrix. |
| W0 | Weights connecting layer 0 and layer 1. |
| L0 | Layer 0, the input layer. |
| L1 | Layer 1, the output layer. |
| L1_delta | Layer 1 delta: the L1 error scaled by the slope of the sigmoid at L1, so confident predictions are adjusted less. |
| L1_error | Layer 1 error: the expected output minus the predicted output. |
Conclusion
In this article, we showed you how to write a simple perceptron in Python using NumPy. We started by importing the NumPy library, defining the sigmoid activation function, and defining the architecture of our neural network to be that of a single-layer perceptron. We then initialized the weights randomly and defined the forward propagation and training functions. Finally, we tested our perceptron by passing a sample input through the forward_propagate() function and printing the output.
This simple neural network can be used as a starting point for more complex models, with additional layers and more nodes, to solve a wide range of machine learning problems.
A Python notebook containing more detail on this and other neural network architectures is available here, and the interactive version is linked here.