How to do it...

In this section, we will build a VAE model to reconstruct Fashion MNIST images. Let's start by defining the network parameters of our VAE:

  1. First, we define the variables that set the network parameters, namely the batch size, input dimension, latent dimension, and number of epochs:
# network parameters
batch_size <- 100L
input_dim <- 784L
latent_dim <- 2L
epochs <- 10
  1. Let's define the input layer and the hidden layer of the encoder part of the VAE:
# VAE input layer and hidden layer encoder
input <- layer_input(shape = c(input_dim))
x <- input %>% layer_dense(units = 256, activation = "relu")
  1. Now, we configure the dense layers that represent the mean and the log variance of the latent distribution:
# mean of latent distribution
z_mean <- x %>% layer_dense(units = latent_dim, name = "mean")

# log variance of latent distribution
z_log_sigma <- x %>% layer_dense(units = latent_dim, name = "sigma")
  1. Next, let's define a sampling function so that we can sample new points from the latent space:
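# sampling
sampling <- function(arg) {
  # split the concatenated tensor into the mean and the log variance
  z_mean <- arg[, 1:(latent_dim)]
  z_log_var <- arg[, (latent_dim + 1):(2 * latent_dim)]
  # standard normal noise with the same shape as z_mean
  epsilon <- k_random_normal(shape = list(k_shape(z_mean)[1], latent_dim),
                             mean = 0, stddev = 1)
  # reparameterization: exp(z_log_var / 2) is the standard deviation
  z_mean + k_exp(z_log_var / 2) * epsilon
}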
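
The sampling step relies on the reparameterization trick: instead of drawing z directly, we draw standard normal noise and shift and scale it. The following base R sketch illustrates the idea outside of Keras. It assumes that the second dense layer outputs the log variance, and the values of `mu`, `log_var`, and `n` are made up for illustration; they are not taken from the recipe.

```r
# Reparameterization trick in base R (illustrative values only)
set.seed(42)
n <- 10000L                       # number of draws (hypothetical)
mu <- c(1.5, -0.5)                # hypothetical latent means
log_var <- c(log(4), log(0.25))   # hypothetical log variances

# epsilon ~ N(0, I), one row per draw and one column per latent dimension
epsilon <- matrix(rnorm(n * length(mu)), nrow = n)

# z = mu + exp(log_var / 2) * epsilon, applied column by column
scaled <- sweep(epsilon, 2, exp(log_var / 2), `*`)
z_samples <- sweep(scaled, 2, mu, `+`)

# the empirical moments should be close to mu and exp(log_var / 2)
sample_mean <- colMeans(z_samples)
sample_sd <- apply(z_samples, 2, sd)
```

Because the randomness lives entirely in `epsilon`, the result is a deterministic function of the mean and log variance, which is what allows gradients to flow back through the sampling layer during training.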
  1. Now, we create a layer that takes the mean and log variance of the latent distribution and generates a random sample from it:
# random point from latent distribution
z <- layer_concatenate(list(z_mean, z_log_sigma)) %>% layer_lambda(sampling)
  1. So far, we have defined a layer to extract a random point. Now, we create some hidden layers for the decoder part of the VAE and combine them to create the output layer:
# VAE decoder hidden layers
x_1 <- layer_dense(units = 256, activation = "relu")
x_2 <- layer_dense(units = input_dim, activation = "sigmoid")

# decoder output
vae_output <- x_2(x_1(z))
  1. Next, we build a variational autoencoder and visualize its summary:
# variational autoencoder
vae <- keras_model(input, vae_output)
summary(vae)

The following screenshot shows the summary of the VAE model:

  1. Now, we create a separate encoder model:
# encoder, from inputs to latent space
encoder <- keras_model(input, c(z_mean, z_log_sigma))
summary(encoder)

The following screenshot shows the summary of the encoder model:

  1. Let's create an independent decoder model as well:
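# decoder input
decoder_input <- layer_input(k_int_shape(z)[-1])

# decoder hidden layers, reusing the layers defined for the VAE
decoder_output <- x_2(x_1(decoder_input))

# decoder, from latent space to reconstructed inputs
decoder <- keras_model(decoder_input, decoder_output)
summary(decoder)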

The following screenshot shows the summary of the decoder model:

  1. Next, we define a custom loss function for the VAE:
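# loss function
vae_loss <- function(x, decoded_output) {
  # binary cross-entropy is averaged over pixels, so scale it back by input_dim
  reconstruction_loss <- (input_dim / 1.0) * loss_binary_crossentropy(x, decoded_output)
  # KL divergence between the latent distribution and a standard normal
  kl_loss <- -0.5 * k_mean(1 + z_log_sigma - k_square(z_mean) - k_exp(z_log_sigma), axis = -1L)
  reconstruction_loss + kl_loss
}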
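
The KL term in the loss uses the closed-form divergence between a Gaussian with mean `mu` and log variance `log_var` and a standard normal. The following base R sketch checks that closed form against numerical integration for a single latent dimension. The function names `kl_closed_form` and `kl_numeric` and the test values are hypothetical and are not part of the recipe.

```r
# Closed-form KL( N(mu, exp(log_var)) || N(0, 1) ) for one latent dimension
kl_closed_form <- function(mu, log_var) {
  -0.5 * (1 + log_var - mu^2 - exp(log_var))
}

# The same quantity computed by integrating q(z) * (log q(z) - log p(z))
kl_numeric <- function(mu, log_var) {
  sigma <- exp(log_var / 2)
  integrand <- function(z) {
    log_q <- dnorm(z, mean = mu, sd = sigma, log = TRUE)
    log_p <- dnorm(z, mean = 0, sd = 1, log = TRUE)
    exp(log_q) * (log_q - log_p)
  }
  # ten standard deviations on each side covers essentially all of the mass
  integrate(integrand, lower = mu - 10 * sigma, upper = mu + 10 * sigma)$value
}

kl_a <- kl_closed_form(mu = 1, log_var = log(0.25))
kl_b <- kl_numeric(mu = 1, log_var = log(0.25))
```

The two values should agree up to the integration tolerance. Note that the recipe averages this quantity over the latent dimensions with `k_mean` rather than summing it, so its KL term is scaled by `1 / latent_dim` relative to the summed form.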
  1. Then, we compile the model:
# compile
vae %>% compile(optimizer = "rmsprop", loss = vae_loss)

Afterwards, we train the model:

# train
vae %>% fit(
  x_train, x_train,
  shuffle = TRUE,
  epochs = epochs,
  batch_size = batch_size,
  validation_data = list(x_test, x_test)
)
  1. Now, let's have a look at some sample images that have been generated by the model:
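library(abind)

# 10 random points in the 2-D latent space
random_distribution <- array(rnorm(n = 20, mean = 0, sd = 4), dim = c(10, 2))
# start the strip with the image decoded from the origin of the latent space
predicted <- array_reshape(predict(decoder, matrix(c(0, 0), ncol = 2)), dim = c(28, 28))

# decode each random point and append its image to the right of the strip
for (i in seq(1, nrow(random_distribution))) {
  one_pred <- predict(decoder, matrix(random_distribution[i, ], ncol = 2))
  predicted <- abind(predicted, array_reshape(one_pred, dim = c(28, 28)), along = 2)
}

options(repr.plot.width = 10, repr.plot.height = 1)
# grid.raster is one way to draw a numeric matrix with values in [0, 1]
grid::grid.raster(predicted, interpolate = FALSE)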
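
The image strip can also be assembled without growing an array inside a loop. The following base R sketch shows one possible approach under stated assumptions: `fake_decoder` is a hypothetical stand-in for `predict(decoder, ...)` so that the snippet runs without a trained model, and it returns constant-intensity images rather than real reconstructions.

```r
# Hypothetical stand-in for predict(decoder, ...): returns one flattened
# 28 x 28 image (784 values in [0, 1]) per latent point. It is NOT the
# trained decoder from the recipe.
fake_decoder <- function(latent_points) {
  t(apply(latent_points, 1, function(p) {
    rep(1 / (1 + exp(-sum(p))), 784)
  }))
}

set.seed(1)
# 10 random points in a 2-D latent space
latent_points <- matrix(rnorm(20, mean = 0, sd = 4), ncol = 2)

# decode all points in one call: 10 rows of 784 pixel values
flat_images <- fake_decoder(latent_points)

# reshape each row to 28 x 28 in row-major order, as array_reshape does
image_list <- lapply(seq_len(nrow(flat_images)), function(i) {
  matrix(flat_images[i, ], nrow = 28, ncol = 28, byrow = TRUE)
})

# bind the images side by side into a single 28 x 280 strip
image_strip <- do.call(cbind, image_list)
```

With the real model, replacing `fake_decoder(latent_points)` with `predict(decoder, latent_points)` should give the same layout, since the decoder also returns one flattened image per row.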

The following image shows the samples generated after the 10th epoch:

In the next section, we'll go through a detailed explanation of the steps we implemented in this section.

