Building a neural network from scratch

To perform the XOR gate operation, we build a simple two-layer neural network, as shown in the following diagram. As you can see, we have an input layer with two nodes, a hidden layer with five nodes, and an output layer comprising one node:

Let's walk through, step by step, how a neural network learns the XOR logic:

  1. First, import the libraries:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
  1. Prepare the data as shown in the preceding XOR table:
X = np.array([ [0, 1], [1, 0], [1, 1], [0, 0] ])
y = np.array([ [1], [1], [0], [0] ])
  1. Define the number of nodes in each layer:
num_input = 2
num_hidden = 5
num_output = 1
  1. Initialize the weights randomly and the biases to zero. First, we initialize the input to hidden layer weights:
Wxh = np.random.randn(num_input,num_hidden)
bh = np.zeros((1,num_hidden))
  1. Now, we initialize the hidden to output layer weights:
Why = np.random.randn(num_hidden,num_output)
by = np.zeros((1,num_output))
  1. Define the sigmoid activation function:
def sigmoid(z):
    return 1 / (1+np.exp(-z))
  1. Define the derivative of the sigmoid function:
def sigmoid_derivative(z):
    return np.exp(-z)/((1+np.exp(-z))**2)
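As a quick sanity check on the derivative defined above, the following sketch compares it against a central finite-difference estimate and against the equivalent closed form sigmoid(z) * (1 - sigmoid(z)). The test points and the step size `eps` are arbitrary values chosen for illustration, not part of the original steps:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    return np.exp(-z) / ((1 + np.exp(-z)) ** 2)

# Arbitrary test points and step size (illustrative assumptions)
z = np.linspace(-3.0, 3.0, 7)
eps = 1e-5

# Central finite-difference estimate of the slope of sigmoid at each z
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
analytic = sigmoid_derivative(z)
closed_form = sigmoid(z) * (1 - sigmoid(z))

print(np.allclose(analytic, numeric, atol=1e-8))   # derivative matches the numeric slope
print(np.allclose(analytic, closed_form))          # both analytic forms agree
```

Both checks should print `True`. The derivative peaks at z = 0 with a value of 0.25 and shrinks toward zero for large positive or negative inputs.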
  1. Define the forward propagation:
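def forward_prop(X,Wxh,Why):
    # hidden layer: weighted sum of the inputs, then sigmoid
    z1 = np.dot(X,Wxh) + bh
    a1 = sigmoid(z1)
    # output layer: weighted sum of the hidden activations, then sigmoid
    z2 = np.dot(a1,Why) + by
    y_hat = sigmoid(z2)

    return z1,a1,z2,y_hat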
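To make the matrix shapes in the forward pass concrete, here is a minimal sketch that runs one pass over the four XOR inputs and prints the shape of each intermediate result. The seeded generator `rng` is an assumption added only so the run is repeatable:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

X = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])   # 4 samples, 2 features
Wxh = rng.standard_normal((2, 5))                # input -> hidden weights
bh = np.zeros((1, 5))
Why = rng.standard_normal((5, 1))                # hidden -> output weights
by = np.zeros((1, 1))

z1 = np.dot(X, Wxh) + bh      # (4, 2) @ (2, 5) -> (4, 5); bh broadcasts over rows
a1 = sigmoid(z1)
z2 = np.dot(a1, Why) + by     # (4, 5) @ (5, 1) -> (4, 1)
y_hat = sigmoid(z2)

print(z1.shape, a1.shape, z2.shape, y_hat.shape)  # (4, 5) (4, 5) (4, 1) (4, 1)
```

Each row of `y_hat` is the network's output for the corresponding row of `X`, and every value lies strictly between 0 and 1 because of the final sigmoid.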
  1. Define the backward propagation:
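def backword_prop(y_hat, z1, a1, z2):
    # error term at the output layer
    delta2 = np.multiply(-(y-y_hat),sigmoid_derivative(z2))
    dJ_dWhy = np.dot(a1.T, delta2)
    # error term propagated back to the hidden layer
    delta1 = np.dot(delta2,Why.T)*sigmoid_derivative(z1)
    dJ_dWxh = np.dot(X.T, delta1)

    return dJ_dWxh, dJ_dWhy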
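One common way to gain confidence in backpropagation formulas like these is a numerical gradient check. The sketch below recomputes the same gradients and compares them with central finite differences of the cost. The helper names `forward`, `cost`, and `numeric_grad`, the seed, and the step size `eps` are hypothetical choices made for this illustration:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    return np.exp(-z) / ((1 + np.exp(-z)) ** 2)

def forward(X, Wxh, bh, Why, by):
    z1 = np.dot(X, Wxh) + bh
    a1 = sigmoid(z1)
    z2 = np.dot(a1, Why) + by
    return z1, a1, z2, sigmoid(z2)

def cost(y, y_hat):
    return 0.5 * np.sum((y - y_hat) ** 2)

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
X = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])
y = np.array([[1], [1], [0], [0]])
Wxh, bh = rng.standard_normal((2, 5)), np.zeros((1, 5))
Why, by = rng.standard_normal((5, 1)), np.zeros((1, 1))

# Analytic gradients, using the same formulas as the backward pass
z1, a1, z2, y_hat = forward(X, Wxh, bh, Why, by)
delta2 = -(y - y_hat) * sigmoid_derivative(z2)
dJ_dWhy = np.dot(a1.T, delta2)
delta1 = np.dot(delta2, Why.T) * sigmoid_derivative(z1)
dJ_dWxh = np.dot(X.T, delta1)

def numeric_grad(W, eps=1e-6):
    """Central-difference gradient of the cost w.r.t. W (W is perturbed in place)."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        original = W[idx]
        W[idx] = original + eps
        J_plus = cost(y, forward(X, Wxh, bh, Why, by)[3])
        W[idx] = original - eps
        J_minus = cost(y, forward(X, Wxh, bh, Why, by)[3])
        W[idx] = original  # restore the weight before moving on
        grad[idx] = (J_plus - J_minus) / (2 * eps)
    return grad

print(np.allclose(dJ_dWhy, numeric_grad(Why), atol=1e-6))
print(np.allclose(dJ_dWxh, numeric_grad(Wxh), atol=1e-6))
```

If both lines print `True`, the analytic gradients agree with the numerical estimates to within the chosen tolerance.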
  1. Define the cost function:
def cost_function(y, y_hat):
    J = 0.5*sum((y-y_hat)**2)

    return J
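To see what the cost function returns, consider a small worked example. The predictions in `y_hat` below are made-up values, not outputs of the network. The squared errors are roughly 0.01, 0.04, 0.04, and 0.01, so the cost is about 0.5 * 0.10 = 0.05:

```python
import numpy as np

def cost_function(y, y_hat):
    J = 0.5 * sum((y - y_hat) ** 2)
    return J

y = np.array([[1], [1], [0], [0]])
y_hat = np.array([[0.9], [0.8], [0.2], [0.1]])   # made-up predictions

J = cost_function(y, y_hat)

# Python's built-in sum adds the rows of the (4, 1) array, so J has shape (1,).
# np.sum collapses every axis and gives a scalar with the same value.
J_scalar = 0.5 * np.sum((y - y_hat) ** 2)

print(J.shape, float(J[0]), float(J_scalar))
```

Either form works for tracking the loss during training. The `np.sum` variant is shown only as an alternative that yields a plain scalar.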
  1. Set the learning rate and the number of training iterations:
alpha = 0.01
num_iterations = 5000
  1. Now, let's start training the network with the following code:
cost = []

for i in range(num_iterations):
    z1,a1,z2,y_hat = forward_prop(X,Wxh,Why)
    dJ_dWxh, dJ_dWhy = backword_prop(y_hat, z1, a1, z2)

    #update weights
    Wxh = Wxh - alpha * dJ_dWxh
    Why = Why - alpha * dJ_dWhy

    #compute cost and store it for plotting
    c = cost_function(y, y_hat)
    cost.append(c)
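Once training has finished, the network's outputs can be turned into 0/1 predictions and compared with the XOR targets. The following self-contained sketch repeats the training steps in compact form and then thresholds the outputs. The seed and the 0.5 cutoff are assumptions made for this illustration, and whether all four rows come out right depends on the random initialization, the learning rate, and the number of iterations:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(z):
    return np.exp(-z) / ((1 + np.exp(-z)) ** 2)

np.random.seed(0)  # hypothetical seed, used only to make the run repeatable

X = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])
y = np.array([[1], [1], [0], [0]])
Wxh, bh = np.random.randn(2, 5), np.zeros((1, 5))
Why, by = np.random.randn(5, 1), np.zeros((1, 1))
alpha, num_iterations = 0.01, 5000

for i in range(num_iterations):
    # forward pass
    z1 = np.dot(X, Wxh) + bh
    a1 = sigmoid(z1)
    z2 = np.dot(a1, Why) + by
    y_hat = sigmoid(z2)
    # backward pass and weight update (only the weights are updated, as in the steps above)
    delta2 = -(y - y_hat) * sigmoid_derivative(z2)
    delta1 = np.dot(delta2, Why.T) * sigmoid_derivative(z1)
    Why = Why - alpha * np.dot(a1.T, delta2)
    Wxh = Wxh - alpha * np.dot(X.T, delta1)

# Outputs from the last forward pass, thresholded at 0.5 (an assumed cutoff)
predictions = (y_hat >= 0.5).astype(int)
accuracy = np.mean(predictions == y)

print(np.hstack([X, y, predictions]))   # columns: x1, x2, target, prediction
print("fraction correct:", accuracy)
```

If some rows are still misclassified, a larger learning rate or more iterations may help, since the settings used here are deliberately conservative.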

  1. Plot the cost function:

plt.plot(cost)
plt.title('Cost Function')
plt.xlabel('Training Iterations')
plt.ylabel('Cost')

As you can observe in the following plot, the loss decreases over the training iterations:

Thus, in this chapter, we gained an overall understanding of artificial neural networks and how they learn.

