A.3. Deep learning

A deep neural network is simply a composition of multiple simpler functions called layers. Each layer consists of a matrix multiplication followed by a nonlinear activation function. The most common activation function is the rectified linear unit (ReLU), f(x) = max(0,x), which returns 0 if x is negative and returns x otherwise.
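As a minimal sketch, the ReLU activation can be written as a one-line NumPy function (the name `relu` is our own; NumPy's `np.maximum` does the elementwise work):

```python
import numpy as np

def relu(x):
    """ReLU activation: elementwise max(0, x)."""
    return np.maximum(0, x)

relu(np.array([-2.0, 0.0, 3.0]))  # → array([0., 0., 3.])
```

Negative entries are clamped to 0 and nonnegative entries pass through unchanged.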

A simple two-layer neural network might be diagrammed like this:

x (length k) → L1 → (length m) → L2 → output (length n)

Read this diagram from left to right: data flows in from the left into the L1 function, then into the L2 function, and emerges as the output on the right. The symbols k, m, and n refer to the dimensionality of the vectors. A k-length vector is input to function L1, which produces an m-length vector that then gets passed to L2, which finally produces an n-length output vector.

Now let’s look at what each of these L functions are doing.

A neural network layer, generically, consists of two parts: a matrix multiplication and an activation function. An input vector of length n comes in from the left and gets multiplied by a matrix (often called a parameter or weight matrix), which may change its dimensionality. The resulting vector, now of length m, gets passed through a nonlinear activation function, which does not change its dimensionality.
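A single layer can be sketched in a few lines of NumPy (the `layer` function name and the shapes are illustrative assumptions):

```python
import numpy as np

def layer(x, w):
    # Matrix multiplication may change dimensionality: (n,) @ (n, m) -> (m,)
    z = x @ w
    # Nonlinear activation preserves dimensionality
    return np.maximum(0, z)

x = np.ones(4)        # input vector, n = 4
w = np.ones((4, 3))   # weight matrix mapping 4 -> 3
layer(x, w).shape     # → (3,)
```

The weight matrix's shape alone determines the output dimensionality; the activation leaves the shape untouched.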

A deep neural network just stacks these layers together, and we train it by applying gradient descent to the weight matrices, which are the parameters of the neural network. Here’s a simple 2-layer neural network in NumPy.

Listing A.2. A simple neural network
import numpy as np

def nn(x,w1,w2):
    l1 = x @ w1                1
    l1 = np.maximum(0,l1)      2
    l2 = l1 @ w2
    l2 = np.maximum(0,l2)
    return l2
 
w1 = np.random.randn(784,200)  3
w2 = np.random.randn(200,10)
x = np.random.randn(784)       4
nn(x,w1,w2)
 
array([326.24915523,   0.        ,   0.        , 301.0265272 ,
       188.47784869,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ])

Because the input and weights are drawn randomly, your output values will differ from those shown.

  • 1 Matrix multiplication
  • 2 Nonlinear activation function (ReLU)
  • 3 Weight (parameter) matrix, initialized randomly
  • 4 Random input vector
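Training means nudging the weight matrices in the direction that reduces a loss. The following is a hedged sketch only: the `loss` and `target` names and the tiny shapes are our own illustrative assumptions, we update only `w2` for brevity, and we estimate the gradient by finite differences rather than the automatic differentiation a real framework provides:

```python
import numpy as np

def nn(x, w1, w2):
    l1 = np.maximum(0, x @ w1)      # layer 1: matmul + ReLU
    return np.maximum(0, l1 @ w2)   # layer 2: matmul + ReLU

def loss(w2, x, w1, target):
    # Squared error between the network output and a target vector
    return np.sum((nn(x, w1, w2) - target) ** 2)

rng = np.random.default_rng(0)
w1 = rng.standard_normal((4, 3))
w2 = rng.standard_normal((3, 2))
x = rng.standard_normal(4)
target = np.array([1.0, 0.0])

lr, eps = 1e-3, 1e-5
loss_before = loss(w2, x, w1, target)
for _ in range(100):
    # Finite-difference estimate of d(loss)/d(w2), one entry at a time
    grad = np.zeros_like(w2)
    for i in range(w2.shape[0]):
        for j in range(w2.shape[1]):
            w_plus = w2.copy()
            w_plus[i, j] += eps
            grad[i, j] = (loss(w_plus, x, w1, target)
                          - loss(w2, x, w1, target)) / eps
    w2 -= lr * grad                  # gradient descent step
loss_after = loss(w2, x, w1, target)
```

Finite differences are far too slow for real networks, which is exactly why automatic differentiation matters.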

In the next section you’ll learn how to use the PyTorch library to compute gradients automatically, making neural networks much easier to train.
