Creating your own autoencoder

Now that you are an expert on autoencoders, let's move on to less theory and more practice. Let's take a bit of a different route on this one. Instead of using an open-source package and showing you how to use it, let's write our own autoencoder framework that you can later enhance and extend. We'll discuss and implement the basic pieces needed, and then write some sample code showing how to use it. We will make this chapter unique in that we won't finish the usage sample; we'll do just enough to get you started along your own path to autoencoder creation. With that in mind, let's begin.

Let's start off by thinking about what an autoencoder is and what things we would want to include. First off, we're going to need to keep track of the number of layers that we have. These layers will be Restricted Boltzmann Machines; where brevity is required, we'll refer to them as RBMs.

So, we know we'll need to track the number of layers that our autoencoder has. We're also going to need to track our learning rate and the weights we'll use: recognition weights and generative weights. Training data is important, of course, as are errors. I think for now that should be it. Let's block out a class to do just this.

Let's start with an interface, which we will use to calculate errors. We will only need one method, which will calculate the error for us. The RBM will be responsible for doing this, but we'll get to that later:

public interface IErrorObserver
{
    void CalculateError(double PError);
}
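As a quick illustration, a minimal observer that simply prints each reported error might look like the following (ConsoleErrorObserver is a hypothetical name, not part of the framework proper):

using System;

public class ConsoleErrorObserver : IErrorObserver
{
    // Called by the RBM each time an error value is computed.
    public void CalculateError(double PError)
    {
        Console.WriteLine($"Current error: {PError}");
    }
}

Any number of observers can be registered with the autoencoder, which is why we keep them in a list rather than a single field.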

Before we define our RBM class, we'll need to look at the layers that it will use. To best represent this, we'll create an abstract class. We'll need to track the state of the layer, the bias used, the amount of bias change, the activity itself, and how many neurons it will have. Rather than distinguish between mirror and canonical neurons, we'll simply represent all neuron types as one single object. We will also need multiple types of RBM layer. Gaussian and binary are two that come to mind, so the following will be the base class for those layers:

public abstract class RestrictedBoltzmannMachineLayer
{
    protected double[] state;
    protected double[] bias;
    protected double[] biasChange;
    protected double[] activity;
    protected int numNeurons = 0;
}

We must keep in mind that our RBM will need to track its weights. Since weights are applied between layers through what is called a synapse, we'll create a class to represent all we want to do with weights. We'll need to track the weights, their changes, and the pre- and post-layer sizes, so let's create a class that encapsulates all of that:

public class RestrictedBoltzmannMachineWeightSet
{
    private int preSize;
    private int postSize;
    private double[][] weights;
    private double[][] weightChanges;
}
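To give a feel for how those fields relate, here is one possible sketch of the constructor and accessors; the accessor names are assumptions, not the finished API:

public class RestrictedBoltzmannMachineWeightSet
{
    private int preSize;       // neurons in the pre-synaptic (visible) layer
    private int postSize;      // neurons in the post-synaptic (hidden) layer
    private double[][] weights;
    private double[][] weightChanges;

    public RestrictedBoltzmannMachineWeightSet(int PPreSize, int PPostSize)
    {
        preSize = PPreSize;
        postSize = PPostSize;
        // Jagged arrays: one row per pre-synaptic neuron, one column per
        // post-synaptic neuron. C# zero-initializes the doubles for us.
        weights = new double[preSize][];
        weightChanges = new double[preSize][];
        for (int i = 0; i < preSize; i++)
        {
            weights[i] = new double[postSize];
            weightChanges[i] = new double[postSize];
        }
    }

    public double GetWeight(int PPre, int PPost) { return weights[PPre][PPost]; }
    public void SetWeight(int PPre, int PPost, double PValue) { weights[PPre][PPost] = PValue; }
    public double GetWeightChange(int PPre, int PPost) { return weightChanges[PPre][PPost]; }
    public void SetWeightChange(int PPre, int PPost, double PValue) { weightChanges[PPre][PPost] = PValue; }
}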

Next, as our learning rate actually comprises separate rates for weights and biases, along with momentum terms for each, we will be best served if we create a separate type to represent all of this:

public struct RestrictedBoltzmannMachineLearningRate
{
    internal double weights;
    internal double biases;
    internal double momentumWeights;
    internal double momentumBiases;
}
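Since later code will want to build one of these from four values in a single call, it is worth sketching a small constructor now; something like the following, where the parameter order simply mirrors the fields:

public struct RestrictedBoltzmannMachineLearningRate
{
    internal double weights;
    internal double biases;
    internal double momentumWeights;
    internal double momentumBiases;

    // Convenience constructor for assembling a per-layer learning rate
    // from individual weight, bias, and momentum values.
    public RestrictedBoltzmannMachineLearningRate(
        double PWeights, double PBiases,
        double PMomentumWeights, double PMomentumBiases)
    {
        weights = PWeights;
        biases = PBiases;
        momentumWeights = PMomentumWeights;
        momentumBiases = PMomentumBiases;
    }
}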

Finally, let's create a class that encompasses our training data:

public struct TrainingData
{
    public double[] posVisible;
    public double[] posHidden;
    public double[] negVisible;
    public double[] negHidden;
}

With all of this now defined, let's go ahead and work on our RestrictedBoltzmannMachine class. For this class, we'll need to keep track of how many visible and hidden layers we have, the weights and learning rate we will use, and our training data:

public class RestrictedBoltzmannMachine
{
    private RestrictedBoltzmannMachineLayer visibleLayers;
    private RestrictedBoltzmannMachineLayer hiddenLayers;
    private RestrictedBoltzmannMachineWeightSet weights;
    private RestrictedBoltzmannMachineLearningRate learnrate;
    private TrainingData trainingdata;
    private int numVisibleLayers;
    private int numHiddenLayers;
}

And, finally, with everything else in place, let's create our Autoencoder class:

public class Autoencoder
{
    private int numlayers;
    private bool pretraining = true;
    private RestrictedBoltzmannMachineLayer[] layers;
    private AutoencoderLearningRate learnrate;
    private AutoencoderWeights recognitionweights;
    private AutoencoderWeights generativeweights;
    private TrainingData[] trainingdata;
    private List<IErrorObserver> errorobservers;
}

Even though we know there will be a lot more required for some of these classes, this is the basic framework that we need to get started framing in the rest of the code. To do that, we should think about some things.

Since weights are a prominent aspect of our autoencoder, we are going to have to use and initialize weights quite often. But how should we initialize our weights, and with what values? We will provide two distinct choices: we will either initialize all weights to zero, or use a Gaussian. We will also have to initialize the biases. Let's go ahead and create an interface for this, which will make it easier later to select the type of initialization we want (zero or Gaussian):

public interface IWeightInitializer
{
    double InitializeWeight();
    double InitializeBias();
}
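To make the two choices concrete, here is one possible sketch of both implementations. The Gaussian version uses a Box-Muller transform, and the default 0.01 standard deviation is an assumption you may want to tune:

using System;

public class ZeroWeightInitializer : IWeightInitializer
{
    public double InitializeWeight() { return 0.0; }
    public double InitializeBias() { return 0.0; }
}

public class GaussianWeightInitializer : IWeightInitializer
{
    private readonly Random rand = new Random();
    private readonly double stddev;

    public GaussianWeightInitializer(double PStdDev = 0.01)
    {
        stddev = PStdDev;
    }

    // Box-Muller: turn two uniform samples into one Gaussian sample.
    private double NextGaussian()
    {
        double u1 = 1.0 - rand.NextDouble();
        double u2 = rand.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Sin(2.0 * Math.PI * u2);
    }

    public double InitializeWeight() { return stddev * NextGaussian(); }

    // Biases are commonly started at zero even when weights are Gaussian.
    public double InitializeBias() { return 0.0; }
}

Small random weights break the symmetry between neurons, which is why a Gaussian is usually preferred over all zeros for the weights themselves.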

We mentioned earlier that we needed multiple types of RBM layer: Gaussian and binary were the two that came to mind. We have already created the base class for them, so let's go ahead and flesh it out, as we will need it shortly. To do this, we will expand our RBM layer class and add two abstract methods, so that layers can be cloned and so that we can set the state of a layer:

public abstract void SetState(int PWhich, double PInput);
public abstract object Clone();

Our RestrictedBoltzmannMachineLayer class now looks like this:

public abstract class RestrictedBoltzmannMachineLayer
{
    protected double[] state;
    protected double[] bias;
    protected double[] biasChange;
    protected double[] activity;
    protected int numNeurons = 0;
    public abstract void SetState(int PWhich, double PInput);
    public abstract object Clone();
}
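To give a feel for what a concrete layer looks like, here is one possible sketch of the binary layer. The sigmoid activation and the sampling step are standard for binary RBM units, but the details here are assumptions rather than the finished implementation:

using System;

public class RestrictedBoltzmannMachineBinaryLayer : RestrictedBoltzmannMachineLayer
{
    private static readonly Random rand = new Random();

    public RestrictedBoltzmannMachineBinaryLayer(int PNumNeurons)
    {
        numNeurons = PNumNeurons;
        state = new double[numNeurons];
        bias = new double[numNeurons];
        biasChange = new double[numNeurons];
        activity = new double[numNeurons];
    }

    // Squash the raw input plus bias through a sigmoid, then sample a
    // 0/1 state from the resulting activation probability.
    public override void SetState(int PWhich, double PInput)
    {
        activity[PWhich] = 1.0 / (1.0 + Math.Exp(-(PInput + bias[PWhich])));
        state[PWhich] = rand.NextDouble() < activity[PWhich] ? 1.0 : 0.0;
    }

    // Shallow clone for brevity; a production version should deep-copy
    // the state, bias, biasChange, and activity arrays.
    public override object Clone()
    {
        return MemberwiseClone();
    }
}

The Gaussian layer would follow the same pattern, with SetState drawing from a normal distribution centered on the activation instead of sampling a binary state.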

With our very basic autoencoder in place, we should now turn our attention to how we will build our autoencoder. Let's try to keep things as modular as possible and, with that in mind, let's create an AutoencoderBuilder class that encapsulates things such as weight initialization, adding layers, and so forth. It will look something like the following:

public class AutoencoderBuilder
{
    private List<RestrictedBoltzmannMachineLayer> layers = new List<RestrictedBoltzmannMachineLayer>();
    private AutoencoderLearningRate learnrate = new AutoencoderLearningRate();
    private IWeightInitializer weightinitializer = new GaussianWeightInitializer();
}

Now that we have this class blocked in, let's begin to add some meat to it in the form of functions. We know that when we build an autoencoder we are going to need to add layers. We can do that with this function. We will pass it the layer, and then update our internal learning-rate lists (note that the weight-related entries are only added once there is a previous layer to connect to):

private void AddLayer(RestrictedBoltzmannMachineLayer PLayer)
{
    learnrate.preLearningRateBiases.Add(0.001);
    learnrate.preMomentumBiases.Add(0.5);
    learnrate.fineLearningRateBiases.Add(0.001);
    if (layers.Count >= 1)
    {
        learnrate.preLearningRateWeights.Add(0.001);
        learnrate.preMomentumWeights.Add(0.5);
        learnrate.fineLearningRateWeights.Add(0.001);
    }
    layers.Add(PLayer);
}

Once we have this base function, we can then add some higher-level functions, which will make it easier for us to add layers to our autoencoder:

public void AddBinaryLayer(int size)
{
    AddLayer(new RestrictedBoltzmannMachineBinaryLayer(size));
}
public void AddGaussianLayer(int size)
{
    AddLayer(new RestrictedBoltzmannMachineGaussianLayer(size));
}

Finally, let's add a Build() method to our autoencoder builder to make it easy to build:

public Autoencoder Build()
{
    return new Autoencoder(layers, learnrate, weightinitializer);
}

Now let's turn our attention to our autoencoder itself. We are going to need a function to help us initialize our biases:

private void InitializeBiases(IWeightInitializer PWInitializer)
{
    for (int i = 0; i < numlayers; i++)
    {
        for (int j = 0; j < layers[i].Count; j++)
        {
            layers[i].SetBias(j, PWInitializer.InitializeBias());
        }
    }
}

Next, we are going to need to initialize our training data. This will basically involve creating all the arrays that we need and setting their initial values to zero as follows:

private void InitializeTrainingData()
{
    trainingdata = new TrainingData[numlayers - 1];
    for (int i = 0; i < numlayers - 1; i++)
    {
        trainingdata[i].posVisible = new double[layers[i].Count];
        Utility.SetArrayToZero(trainingdata[i].posVisible);
        trainingdata[i].posHidden = new double[layers[i + 1].Count];
        Utility.SetArrayToZero(trainingdata[i].posHidden);
        trainingdata[i].negVisible = new double[layers[i].Count];
        Utility.SetArrayToZero(trainingdata[i].negVisible);
        trainingdata[i].negHidden = new double[layers[i + 1].Count];
        Utility.SetArrayToZero(trainingdata[i].negHidden);
    }
}
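Tying these two helpers together, the Autoencoder constructor that our builder's Build() method calls might be sketched as follows; the AutoencoderWeights constructor signature is an assumption here:

// One possible constructor, wiring together the initializers above.
public Autoencoder(List<RestrictedBoltzmannMachineLayer> PLayers,
    AutoencoderLearningRate PLearnRate, IWeightInitializer PWInitializer)
{
    layers = PLayers.ToArray();
    numlayers = layers.Length;
    learnrate = PLearnRate;
    // Hypothetical AutoencoderWeights constructor: builds one weight set
    // per pair of adjacent layers, using the supplied initializer.
    recognitionweights = new AutoencoderWeights(layers, PWInitializer);
    generativeweights = new AutoencoderWeights(layers, PWInitializer);
    errorobservers = new List<IErrorObserver>();
    InitializeBiases(PWInitializer);
    InitializeTrainingData();
}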

With that behind us, we're off to a good start. Let's start to use the software and see what we're missing. Let's create our builder object, add some binary and Gaussian layers, and see how it looks:

AutoencoderBuilder builder = new AutoencoderBuilder();
builder.AddBinaryLayer(4);
builder.AddBinaryLayer(3);
builder.AddGaussianLayer(3);
builder.AddGaussianLayer(1);

Not bad, right? So, what's next? Well, we've got our autoencoder created and have added layers. What we still lack are functions for setting the pre-training and fine-tuning learning rates and momentum. Let's see how they would look if we were to add them, as follows:

builder.SetFineTuningLearningRateBiases(0, 1.0);
builder.SetFineTuningLearningRateWeights(0, 1.0);
builder.SetPreTrainingLearningRateBiases(0, 1.0);
builder.SetPreTrainingLearningRateWeights(0, 1.0);
builder.SetPreTrainingMomentumBiases(0, 0.1);
builder.SetPreTrainingMomentumWeights(0, .05);

That looks about right. At this point, we should add these functions to our AutoencoderBuilder object so we can use them. Let's see how that would look. Remember that our builder object automatically created our learning rate object, so now we just have to use it to populate things such as our weights and biases, along with the momentum weights and biases:

public void SetPreTrainingLearningRateWeights(int PWhich, double PLR)
{
    learnrate.preLearningRateWeights[PWhich] = PLR;
}
public void SetPreTrainingLearningRateBiases(int PWhich, double PLR)
{
    learnrate.preLearningRateBiases[PWhich] = PLR;
}
public void SetPreTrainingMomentumWeights(int PWhich, double PMom)
{
    learnrate.preMomentumWeights[PWhich] = PMom;
}
public void SetPreTrainingMomentumBiases(int PWhich, double PMom)
{
    learnrate.preMomentumBiases[PWhich] = PMom;
}
public void SetFineTuningLearningRateWeights(int PWhich, double PLR)
{
    learnrate.fineLearningRateWeights[PWhich] = PLR;
}
public void SetFineTuningLearningRateBiases(int PWhich, double PLR)
{
    learnrate.fineLearningRateBiases[PWhich] = PLR;
}

Well, let's now stop and take a look at what our sample program is turning out to look like:

AutoencoderBuilder builder = new AutoencoderBuilder();
builder.AddBinaryLayer(4);
builder.AddBinaryLayer(3);
builder.AddGaussianLayer(3);
builder.AddGaussianLayer(1);
builder.SetFineTuningLearningRateBiases(0, 1.0);
builder.SetFineTuningLearningRateWeights(0, 1.0);
builder.SetPreTrainingLearningRateBiases(0, 1.0);
builder.SetPreTrainingLearningRateWeights(0, 1.0);
builder.SetPreTrainingMomentumBiases(0, 0.1);
builder.SetPreTrainingMomentumWeights(0, .05);

Not bad. All we should need to do now is to call our Build() method on our builder and we should have the first version of our framework:

Autoencoder encoder = builder.Build();

With all this now complete, and looking back at the preceding code, I think at some point we are going to need to gain access to our individual layers; what do you think? Just in case, we'd better provide a function to do that. Let's see how that would look:

RestrictedBoltzmannMachineLayer layer = encoder.GetLayer(0);
RestrictedBoltzmannMachineLayer layerHidden = encoder.GetLayer(1);

Since our internal layers are RestrictedBoltzmannMachineLayer objects, that is the type we should return, as you can see from the previous code. The GetLayer() function needs to reside inside our Autoencoder object, though, not the builder, so let's go ahead and add it now. We'll need to be good developers and include a bounds check to ensure we are passed a valid layer index before we try to use it. We'll store all those neat little utility functions in a class of their own, which we might as well call Utility, since the name makes sense. I won't go into how to code that function, as I am fairly confident every reader already knows how to do a bounds check; you can either write your own or look at the accompanying source code to see how it's done in this instance:

public RestrictedBoltzmannMachineLayer GetLayer(int PWhichLayer)
{
    Utility.WithinBounds("Layer index out of bounds!", PWhichLayer, numlayers);
    return layers[PWhichLayer];
}
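For reference, one minimal way those Utility helpers could look is sketched below; the accompanying source code may differ in the details:

using System;

public static class Utility
{
    // Throws if PIndex is not a valid index into a collection of PCount items.
    public static void WithinBounds(string PMessage, int PIndex, int PCount)
    {
        if (PIndex < 0 || PIndex >= PCount)
        {
            throw new ArgumentOutOfRangeException(nameof(PIndex), PMessage);
        }
    }

    // Resets every element of the array to zero in one call.
    public static void SetArrayToZero(double[] PArray)
    {
        Array.Clear(PArray, 0, PArray.Length);
    }
}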

OK, so we can now create our autoencoders, set weights and biases, and gain access to individual layers. I think the next thing we need to start thinking about is training and testing. We'll need to take each separately, of course, so why don't we start with training?

We will need to be able to train our RBM, so why don't we create an object dedicated to doing this. We'll call it, no surprise here, RestrictedBoltzmannMachineTrainer. Again, we are going to need to deal with our learning-rate and weight-set objects, so let's make sure we add them as variables right away:

public static class RestrictedBoltzmannMachineTrainer
{
    private static RestrictedBoltzmannMachineLearningRate learnrate;
    private static RestrictedBoltzmannMachineWeightSet weightset;
}

Now, what functions do you think we will need for our trainer? Obviously, a Train() method is required; otherwise, we named our object incorrectly. We'll also need to train our weights and layer biases:

private static void TrainWeight(int PWhichVis, int PWhichHid, double PTrainAmount);
private static void TrainBias(RestrictedBoltzmannMachineLayer PLayer, int PWhich, double PPosPhase, double PNegPhase);

Last, but not least, we should probably have a helper function that lets us know the training amount, which for us will involve taking the positive visible amount times the positive hidden amount and subtracting that from the negative visible amount times the negative hidden amount:

private static double CalculateTrainAmount(double PPosVis, double PPosHid, double PNegVis, double PNegHid)
{
    return ((PPosVis * PPosHid) - (PNegVis * PNegHid));
}
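To show how these pieces fit together, here is one possible body for TrainWeight. The learning-rate-plus-momentum update is the standard contrastive-divergence weight rule, but the weight-set accessors used here (GetWeight, SetWeightChange, and so on) are assumed helpers, not a confirmed API:

// Scale the training amount by the learning rate, blend in momentum
// from the previous change, then apply the result to the weight.
private static void TrainWeight(int PWhichVis, int PWhichHid, double PTrainAmount)
{
    double previous = weightset.GetWeightChange(PWhichVis, PWhichHid);
    double change = (learnrate.weights * PTrainAmount)
                  + (learnrate.momentumWeights * previous);
    weightset.SetWeightChange(PWhichVis, PWhichHid, change);
    weightset.SetWeight(PWhichVis, PWhichHid,
        weightset.GetWeight(PWhichVis, PWhichHid) + change);
}

Momentum carries a fraction of the previous update into the current one, which smooths the weight trajectory across batches.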

OK, let's see where our program stands:

AutoencoderBuilder builder = new AutoencoderBuilder();
builder.AddBinaryLayer(4);
builder.AddBinaryLayer(3);
builder.AddGaussianLayer(3);
builder.AddGaussianLayer(1);
builder.SetFineTuningLearningRateBiases(0, 1.0);
builder.SetFineTuningLearningRateWeights(0, 1.0);
builder.SetPreTrainingLearningRateBiases(0, 1.0);
builder.SetPreTrainingLearningRateWeights(0, 1.0);
builder.SetPreTrainingMomentumBiases(0, 0.1);
builder.SetPreTrainingMomentumWeights(0, .05);
Autoencoder encoder = builder.Build();
RestrictedBoltzmannMachineLayer layer = encoder.GetLayer(0);
RestrictedBoltzmannMachineLayer layerHidden = encoder.GetLayer(1);

Nice. Can you see how it's all starting to come together? Now it's time to consider how we are going to add data to our network. Before we do any kind of training on the network, we will need to load data. How will we do this? Let's consider the notion of pre-training. This is the act of loading data into the network manually before we train it. What would this function look like in the context of our program? How about something such as this?

encoder.PreTrain(0, new double[] {0.1, .05, .03, 0.8});

We would just need to tell our autoencoder which layer we want to populate with data, and then supply the data. That should work for us. If we did this, then the following is how our program would evolve:

AutoencoderBuilder builder = new AutoencoderBuilder();
builder.AddBinaryLayer(4);
builder.AddBinaryLayer(3);
builder.AddGaussianLayer(3);
builder.AddGaussianLayer(1);
builder.SetFineTuningLearningRateBiases(0, 1.0);
builder.SetFineTuningLearningRateWeights(0, 1.0);
builder.SetPreTrainingLearningRateBiases(0, 1.0);
builder.SetPreTrainingLearningRateWeights(0, 1.0);
builder.SetPreTrainingMomentumBiases(0, 0.1);
builder.SetPreTrainingMomentumWeights(0, .05);
Autoencoder encoder = builder.Build();
RestrictedBoltzmannMachineLayer layer = encoder.GetLayer(0);
RestrictedBoltzmannMachineLayer layerHidden = encoder.GetLayer(1);
encoder.PreTrain(0, new double[] {0.1, .05, .03, 0.8});
encoder.PreTrain(1, new double[] { 0.1, .05, .03, 0.9 });
encoder.PreTrain(2, new double[] { 0.1, .05, .03, 0.1 });
encoder.PreTrainingComplete();

What do you think so far? With this code, we would be able to populate three layers with data. I threw in an extra function, PreTrainingComplete, as a nice way to let our program know that we have finished pre-training. Now, let's figure out how those functions come together.

For pretraining, we will process the data in batches. We can have anywhere from one to n batches and, in many cases, the number of batches will be just one. Once we determine the number of batches we want to use, we will iterate through each batch of data.

For each batch of data, we will process the data and determine whether our neurons were activated. We then set the layer state based upon that. We will move both forward and backward through the network, setting our states. Using the following diagram, we will move forward through layers like this Y -> V -> W -> (Z):

Once activations are set, we must perform the actual pre-training. We do this per pre-synaptic layer, starting at layer 0. When we pre-train, we call our trainer object's Train() method, which we called for earlier, and pass it the layers, the training data, our recognition weights, and the learning rate. To do this, we will need to create our actual function, which we will call PerformPreTraining(). The following is what this code would look like:

private void PerformPreTraining(int PPreSynapticLayer)
{
    RestrictedBoltzmannMachineLearningRate sentlearnrate = new RestrictedBoltzmannMachineLearningRate(
        learnrate.preLearningRateWeights[PPreSynapticLayer],
        learnrate.preLearningRateBiases[PPreSynapticLayer],
        learnrate.preMomentumWeights[PPreSynapticLayer],
        learnrate.preMomentumBiases[PPreSynapticLayer]);
    RestrictedBoltzmannMachineTrainer.Train(
        layers[PPreSynapticLayer],
        layers[PPreSynapticLayer + 1],
        trainingdata[PPreSynapticLayer],
        sentlearnrate,
        recognitionweights.GetWeightSet(PPreSynapticLayer));
}

Once pre-training is complete, we now will need to calculate the error rate based upon the positive and negative visible data properties. That will complete our pretraining function, and our sample program will now look as follows:
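That error calculation could be as simple as a sum of squared differences between the positive and negative visible data, reported to every registered observer. A sketch, assuming it lives inside Autoencoder, might be:

// Reconstruction error for one RBM: squared distance between the data we
// clamped on the visible layer and what the network reconstructed.
private void CalculateError(int PWhichRBM)
{
    double error = 0.0;
    TrainingData data = trainingdata[PWhichRBM];
    for (int i = 0; i < data.posVisible.Length; i++)
    {
        double diff = data.posVisible[i] - data.negVisible[i];
        error += diff * diff;
    }
    // Notify everyone who asked to hear about errors.
    foreach (IErrorObserver observer in errorobservers)
    {
        observer.CalculateError(error);
    }
}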

AutoencoderBuilder builder = new AutoencoderBuilder();
builder.AddBinaryLayer(4);
builder.AddBinaryLayer(3);
builder.AddGaussianLayer(3);
builder.AddGaussianLayer(1);
builder.SetFineTuningLearningRateBiases(0, 1.0);
builder.SetFineTuningLearningRateWeights(0, 1.0);
builder.SetPreTrainingLearningRateBiases(0, 1.0);
builder.SetPreTrainingLearningRateWeights(0, 1.0);
builder.SetPreTrainingMomentumBiases(0, 0.1);
builder.SetPreTrainingMomentumWeights(0, .05);
Autoencoder encoder = builder.Build();
RestrictedBoltzmannMachineLayer layer = encoder.GetLayer(0);
RestrictedBoltzmannMachineLayer layerHidden = encoder.GetLayer(1);
encoder.PreTrain(0, new double[] {0.1, .05, .03, 0.8});
encoder.PreTrain(1, new double[] { 0.1, .05, .03, 0.9 });
encoder.PreTrain(2, new double[] { 0.1, .05, .03, 0.1 });
encoder.PreTrainingComplete();

With all this code behind us, all we need to do now is to save the autoencoder, and we should be all set. We will do this by creating a Save() function in the autoencoder, and call it as follows:

encoder.Save("testencoder.txt");

To implement this function, let's look at what we need to do. First, we need a filename to use for the autoencoder name. Once we open a .NET TextWriter object, we then save the learning rates, the recognition weights, and generative weights. Next, we iterate through all the layers, write out the layer type, and then save the data. If you decide to implement more types of RBM layers than we created, make sure that you in turn update the Save() and Load() methods so that your new layer data is saved and re-loaded correctly.

Let's look at our Save function:

public void Save(string PFilename)
{
    TextWriter file = new StreamWriter(PFilename);
    learnrate.Save(file);
    recognitionweights.Save(file);
    generativeweights.Save(file);
    file.WriteLine(numlayers);
    for (int i = 0; i < numlayers; i++)
    {
        if (layers[i].GetType() == typeof(RestrictedBoltzmannMachineGaussianLayer))
        {
            file.WriteLine("RestrictedBoltzmannMachineGaussianLayer");
        }
        else if (layers[i].GetType() == typeof(RestrictedBoltzmannMachineBinaryLayer))
        {
            file.WriteLine("RestrictedBoltzmannMachineBinaryLayer");
        }
        layers[i].Save(file);
    }
    file.WriteLine(pretraining);
    file.Close();
}

With our autoencoder saved to disk, we now should deal with the ability to reload that data into memory and create an autoencoder from it. So, we'll need a Load() function. We'll basically follow the steps we took to write our autoencoder to disk but, this time, we'll read the values in instead of writing them out. Our weights, learning rate, and layers will also have a Load() function, just as each of the preceding items had a Save() function.

Our Load() function will be a bit different in its declaration. Since we are loading in a saved autoencoder, we have to assume that, at the time this call is made, an autoencoder object has not yet been created. Therefore, we will make this function static on the Autoencoder class itself, as it will return a newly created autoencoder for us. Here's how our function will look:

public static Autoencoder Load(string PFilename)
{
    TextReader file = new StreamReader(PFilename);
    Autoencoder retval = new Autoencoder();
    retval.learnrate = new AutoencoderLearningRate();
    retval.learnrate.Load(file);
    retval.recognitionweights = new AutoencoderWeights();
    retval.recognitionweights.Load(file);
    retval.generativeweights = new AutoencoderWeights();
    retval.generativeweights.Load(file);
    retval.numlayers = int.Parse(file.ReadLine());
    retval.layers = new RestrictedBoltzmannMachineLayer[retval.numlayers];
    for (int i = 0; i < retval.numlayers; i++)
    {
        string type = file.ReadLine();
        if (type == "RestrictedBoltzmannMachineGaussianLayer")
        {
            retval.layers[i] = new RestrictedBoltzmannMachineGaussianLayer();
        }
        else if (type == "RestrictedBoltzmannMachineBinaryLayer")
        {
            retval.layers[i] = new RestrictedBoltzmannMachineBinaryLayer();
        }
        retval.layers[i].Load(file);
    }
    retval.pretraining = bool.Parse(file.ReadLine());
    retval.InitializeTrainingData();
    retval.errorobservers = new List<IErrorObserver>();
    file.Close();
    return retval;
}

With that done, let's see how we would call our Load() function. It should be like the following:

Autoencoder newAutoencoder = Autoencoder.Load("testencoder.txt");

So, let's stop here and take a look at all we've accomplished. Let's see what our program can do, as follows:

AutoencoderBuilder builder = new AutoencoderBuilder();
builder.AddBinaryLayer(4);
builder.AddBinaryLayer(3);
builder.AddGaussianLayer(3);
builder.AddGaussianLayer(1);
builder.SetFineTuningLearningRateBiases(0, 1.0);
builder.SetFineTuningLearningRateWeights(0, 1.0);
builder.SetPreTrainingLearningRateBiases(0, 1.0);
builder.SetPreTrainingLearningRateWeights(0, 1.0);
builder.SetPreTrainingMomentumBiases(0, 0.1);
builder.SetPreTrainingMomentumWeights(0, .05);
Autoencoder encoder = builder.Build();
RestrictedBoltzmannMachineLayer layer = encoder.GetLayer(0);
RestrictedBoltzmannMachineLayer layerHidden = encoder.GetLayer(1);
encoder.PreTrain(0, new double[] {0.1, .05, .03, 0.8});
encoder.PreTrain(1, new double[] { 0.1, .05, .03, 0.9 });
encoder.PreTrain(2, new double[] { 0.1, .05, .03, 0.1 });
encoder.PreTrainingComplete();
encoder.Save("testencoder.txt");
Autoencoder newAutoencoder = Autoencoder.Load("testencoder.txt");