Building a neural network from scratch

To perform the XOR gate operation, we build a simple two-layer neural network, as shown in the following diagram. As you can see, we have an input layer with two nodes, a hidden layer with five nodes, and an output layer comprising one node:

We will now walk through, step by step, how the neural network learns the XOR logic:

  1. First, import the libraries:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
  2. Prepare the data as shown in the preceding XOR table:
X = np.array([ [0, 1], [1, 0], [1, 1], [0, 0] ])
y = np.array([ [1], [1], [0], [0] ])
  3. Define the number of nodes in each layer:
num_input = 2
num_hidden = 5
num_output = 1
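Since the weights in the next two steps are drawn randomly, every run starts from a different initialization and the cost curve will vary slightly between runs. If you want reproducible results, you can optionally fix NumPy's random seed first (the seed value here is arbitrary):
np.random.seed(42)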
  4. Initialize the weights and bias randomly. First, we initialize the input-to-hidden layer weights:
Wxh = np.random.randn(num_input, num_hidden)
bh = np.zeros((1, num_hidden))
  5. Now, we initialize the hidden-to-output layer weights:
Why = np.random.randn(num_hidden, num_output)
by = np.zeros((1, num_output))
  6. Define the sigmoid activation function:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
  7. Define the derivative of the sigmoid function:
def sigmoid_derivative(z):
    return np.exp(-z) / ((1 + np.exp(-z)) ** 2)
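As a side note, because sigmoid(z) = 1 / (1 + np.exp(-z)), the expression above is mathematically identical to sigmoid(z) * (1 - sigmoid(z)), a form that reuses the sigmoid function itself. A quick, purely illustrative check that the two forms agree:
z = np.linspace(-5, 5, 11)
print(np.allclose(sigmoid_derivative(z), sigmoid(z) * (1 - sigmoid(z))))  # prints True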
  8. Define the forward propagation:
def forward_prop(X, Wxh, Why):
    # Hidden layer: linear transform followed by sigmoid activation
    z1 = np.dot(X, Wxh) + bh
    a1 = sigmoid(z1)
    # Output layer: linear transform followed by sigmoid activation
    z2 = np.dot(a1, Why) + by
    y_hat = sigmoid(z2)

    return z1, a1, z2, y_hat
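Before training, it can help to sanity-check the wiring with a single forward pass; with the four XOR inputs, y_hat should have shape (4, 1), one prediction per input row. This check is optional and not part of the original steps:
_, _, _, y_hat = forward_prop(X, Wxh, Why)
print(y_hat.shape)  # (4, 1)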
  9. Define the backward propagation:
def backward_prop(y_hat, z1, a1, z2):
    # Gradient of the cost with respect to the output pre-activation z2
    delta2 = np.multiply(-(y - y_hat), sigmoid_derivative(z2))
    dJ_dWhy = np.dot(a1.T, delta2)
    # Propagate the error back to the hidden pre-activation z1
    delta1 = np.dot(delta2, Why.T) * sigmoid_derivative(z1)
    dJ_dWxh = np.dot(X.T, delta1)

    return dJ_dWxh, dJ_dWhy
  10. Define the cost function:
def cost_function(y, y_hat):
    # Sum-of-squared-errors cost
    J = 0.5 * np.sum((y - y_hat) ** 2)

    return J
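With backward_prop and cost_function both in place, an optional way to gain confidence in the analytic gradients is a numerical gradient check: nudge a single weight by a small epsilon and compare the change in cost against the corresponding entry of the backpropagated gradient. A minimal sketch, assuming the functions and variables defined above (numerical_grad is a hypothetical helper, not part of the original recipe):
def numerical_grad(i, j, eps=1e-5):
    # Central-difference approximation of dJ/dWxh[i, j]
    Wxh[i, j] += eps
    _, _, _, y_plus = forward_prop(X, Wxh, Why)
    Wxh[i, j] -= 2 * eps
    _, _, _, y_minus = forward_prop(X, Wxh, Why)
    Wxh[i, j] += eps  # restore the original weight
    return (cost_function(y, y_plus) - cost_function(y, y_minus)) / (2 * eps)

z1, a1, z2, y_hat = forward_prop(X, Wxh, Why)
dJ_dWxh, _ = backward_prop(y_hat, z1, a1, z2)
print(dJ_dWxh[0, 0], numerical_grad(0, 0))  # the two values should nearly match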
  11. Set the learning rate and the number of training iterations:
alpha = 0.01
num_iterations = 5000
  12. Now, let's start training the network with the following code:
cost = []

for i in range(num_iterations):
    z1, a1, z2, y_hat = forward_prop(X, Wxh, Why)
    dJ_dWxh, dJ_dWhy = backward_prop(y_hat, z1, a1, z2)

    # Update the weights by gradient descent
    Wxh = Wxh - alpha * dJ_dWxh
    Why = Why - alpha * dJ_dWhy

    # Compute and record the cost
    c = cost_function(y, y_hat)
    cost.append(c)
  13. Plot the cost function:
plt.grid()
plt.plot(range(num_iterations), cost)
plt.title('Cost Function')
plt.xlabel('Training Iterations')
plt.ylabel('Cost')

As you can observe in the following plot, the loss decreases over the training iterations:
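You can also check what the trained network predicts by running one more forward pass and rounding the outputs at the 0.5 threshold (an illustrative check; depending on the random initialization, more iterations or a larger learning rate may be needed before the rounded outputs match y exactly):
_, _, _, y_hat = forward_prop(X, Wxh, Why)
print(np.round(y_hat))  # ideally [[1.], [1.], [0.], [0.]], matching y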

Thus, in this chapter, we got an overall understanding of artificial neural networks and how they learn.
