Layering

Once our RBM learns the structure of the input data, as reflected in the activations of our first hidden layer, the data is passed on to the next hidden layer. The first hidden layer then becomes the new visible layer: the activations it produced now serve as our inputs, and they are multiplied by the weights of the next hidden layer to produce another set of activations. This process continues through all the hidden layers in our network. The hidden layer becomes the visible layer, we have another hidden layer whose weights we will use, and we repeat. Each new hidden layer results in adjusted weights, until we get to the point where we can recognize the input from the previous layer.
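The layer-to-layer hand-off described above can be sketched in a few lines. This is a minimal illustration, not SharpRBM's actual code: the layer sizes, the random weights, and the `propagate_up` helper are all hypothetical, chosen only to show how each hidden layer's activations become the next layer's visible input.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation, used here for hidden-unit probabilities."""
    return 1.0 / (1.0 + np.exp(-x))

def propagate_up(visible, weights, bias):
    """Turn a visible vector into hidden activations for one layer."""
    return sigmoid(visible @ weights + bias)

# Hypothetical stack of three hidden layers: 784 -> 256 -> 64 -> 16 units.
rng = np.random.default_rng(0)
layer_sizes = [784, 256, 64, 16]
layers = [
    (rng.normal(0.0, 0.01, (n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

data = rng.random(784)          # one flattened 28x28 image
activations = data
for weights, bias in layers:
    # The previous hidden layer now acts as the visible layer.
    activations = propagate_up(activations, weights, bias)

print(activations.shape)        # (16,)
```

The loop body is the whole idea: whatever came out of the layer below goes in at the bottom of the layer above.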

To elaborate just a bit more (helping you in your quest to remain buzzword-compliant), this is technically called unsupervised, greedy, layer-wise training. No labels are required to improve the weights of each layer, which means no outside influence of any kind is involved. This further means we should be able to use our algorithm to train on unlabeled data that has not been seen previously. As we have continually stressed, the more data we have, the better our results! As each layer gets better and, we hope, more accurate, we are in a much better position to increase our learning through each hidden layer, with the weights having the responsibility of guiding us toward the correct image classification along the way.

As we discuss reconstruction, we should point out that each time a weight in our reconstruction effort is non-zero, that is an indication that our RBM has learned something from the data. In a sense, you can treat the returned numbers much as you would a percentage indicator: the higher the number, the more confident the algorithm is about what it is seeing. Remember, we have the master dataset that we are trying to get back to, and we have a reference dataset to use in our reconstruction efforts. As our RBM iterates over each image, it does not yet know which image it is dealing with; that is what it is trying to determine.
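The reconstruction step itself is a single up-down pass: project the visible units into the hidden layer, then project back using the same weights transposed, and compare. The sketch below is an assumption-laden illustration (the `reconstruct` helper and the weight shapes are ours, not SharpRBM's), but the error measure at the end is the same kind of reconstruction error the demo reports during training.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(visible, weights, v_bias, h_bias):
    """One up-down pass: visible -> hidden -> reconstructed visible."""
    hidden = sigmoid(visible @ weights + h_bias)
    return sigmoid(hidden @ weights.T + v_bias)

rng = np.random.default_rng(1)
weights = rng.normal(0.0, 0.1, (784, 128))   # 784 pixels, 128 hidden units
v_bias = np.zeros(784)
h_bias = np.zeros(128)

image = rng.integers(0, 2, 784).astype(float)  # a binary pixel vector
recon = reconstruct(image, weights, v_bias, h_bias)

# Sum of squared differences between the input and its reconstruction:
# the lower this is, the more structure the RBM has captured.
error = np.sum((image - recon) ** 2)
print(error)
```

Untrained random weights give a large error; as training progresses, this number falls, which is exactly the trend you will watch in the demo's progress panel.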

Let's take a brief moment to clarify something. When we say we are using a greedy algorithm, what we really mean is that our RBM will take the shortest path to the best result. We sample random pixels from the image we see and test which ones lead us to the correct answer. The RBM tests each hypothesis against the master dataset (our test set), which represents our correct end goal. Keep in mind that each image is just a set of pixels we are trying to classify. Those pixels carry the features and characteristics of the data. For example, a pixel can take on different shades of light, where dark pixels perhaps indicate borders, light pixels perhaps indicate numbers, and so forth.

But what happens when things don't go our way? What happens if whatever we learn at a given step is not correct? Should this occur, it means our algorithm has guessed incorrectly, and our course of action is to go back and try again. This is not as bad, nor as time-consuming, as it may seem. Of course, there is a temporal cost to an incorrect hypothesis, but the end goal is to increase our learning efficiency and reduce our error at each stage. Each weighted connection that was wrong is penalized, much as we did in reinforcement learning: those connections decrease in weight and are no longer as strong. The hope is that the next pass through will increase our accuracy while decreasing our error, since the stronger the weight, the more influence it has.
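The penalize-and-strengthen behavior just described is, in most RBM implementations, a contrastive divergence update: connections that explain the real data are strengthened (the positive phase), and connections that only explain the model's own reconstruction are weakened (the negative phase). The following is a minimal CD-1 sketch under our own naming assumptions, not SharpRBM's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(visible, weights, h_bias, v_bias, lr=0.1):
    """One step of contrastive divergence (CD-1).

    Connections supported by the data gain weight (positive phase);
    connections supported only by the reconstruction lose weight
    (negative phase) -- the 'penalty' for a wrong hypothesis.
    """
    h_prob = sigmoid(visible @ weights + h_bias)        # up
    recon = sigmoid(h_prob @ weights.T + v_bias)        # down
    h_recon = sigmoid(recon @ weights + h_bias)         # up again
    positive = np.outer(visible, h_prob)
    negative = np.outer(recon, h_recon)
    return weights + lr * (positive - negative)

rng = np.random.default_rng(2)
weights = rng.normal(0.0, 0.1, (6, 4))                  # 6 visible, 4 hidden
v = np.array([1.0, 1.0, 0.0, 0.0, 1.0, 0.0])            # one tiny "image"
new_weights = cd1_update(v, weights, np.zeros(4), np.zeros(6))
print(new_weights.shape)   # (6, 4)
```

Run repeatedly over many images, this nudging is what makes the reconstructed digits in the demo sharpen from generation to generation.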

So, let's take a hypothetical scenario and think aloud for a second. Let's say we are classifying numeric images, meaning numbers. Some images will have curves, such as 2, 3, 6, 8, 9, and so on. Other numbers, such as 1, 4, and 7, will not. Knowledge such as this is very important, because our RBM will use it to continue to improve its learning and reduce error. If we think we're dealing with the number 2, then the weights along the path that indicates this to us will be weighted more heavily than others. This is a drastic oversimplification, but hopefully it's enough to help you understand what we are about to embark upon.

As we put all this together, we now have the theoretical framework for a Deep Belief Network. Although we have delved into more theory than in other chapters, as you see our example program working, it will all start to make sense. And you will be much better prepared to use it in your applications, knowing what's happening behind the scenes. Remember, black hole versus black box!

To show you both Deep Belief Networks and RBMs, we are going to use the fantastic open source software SharpRBM, written by Mattias Fagerlund. This software is an incredible contribution to the open source community, and I have no doubt you will spend hours, if not days, working with it. It comes with some incredible demos; for this chapter, we will use the Letter Classification Demo.

The following screenshot is of our deep belief test application. Ever wonder what a computer dreams of when it's sleeping? Well, my friend, you are about to find out!

As usual, we will also use ReflectInsight to provide us with a behind-the-scenes look into what is going on:

The first thing you will notice about our demo application is that there is a lot going on. Let's take a moment and break it down into smaller chunks.

In the upper-left corner of the program screen is the area where we designate the layer that we want to train. We have three hidden layers, all of which need proper training before testing. We can train each layer one at a time, starting with the first layer. You may train for as long or as little as you like, but the more you train, the better your system will be:

The next section following our training options is our progress. As we are training, all pertinent information, such as generation, reconstruction error, detector error, and learning rate, is displayed here:

The next section is the drawing of our feature detectors, which will update themselves throughout training if the Draw checkbox is checked:

As you begin training a layer, you will notice that the reconstructions and feature detectors are basically empty. They will refine themselves as your training progresses. Remember, we are reconstructing what we already know to be true! As the training continues, the reconstructed digits become more and more defined, along with our feature detector:

Here is a snapshot from the application during training. As you can see, it is on generation 31 and the reconstructed digits are very well defined. They are still not complete or correct, but you can see just how much progress we are making:
