In this section, we will build a VAE model that reconstructs Fashion-MNIST images. Let's start by defining the network parameters of our VAE:
- First, we define the variables that configure the network: the batch size, the input dimension, the latent dimension, and the number of epochs:
# network parameters
batch_size <- 100L
input_dim <- 784L
latent_dim <- 2L
epochs <- 10L
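The training code later in this section assumes that `x_train` and `x_test` already hold flattened, normalized Fashion-MNIST images. A minimal preparation sketch, assuming the `dataset_fashion_mnist()` loader that ships with the keras package:

```r
# load Fashion-MNIST and flatten each 28x28 image into a 784-dimensional vector
library(keras)
fashion <- dataset_fashion_mnist()
x_train <- array_reshape(fashion$train$x, c(nrow(fashion$train$x), input_dim)) / 255
x_test <- array_reshape(fashion$test$x, c(nrow(fashion$test$x), input_dim)) / 255
```

Dividing by 255 rescales the pixel intensities to [0, 1], which matches the sigmoid output layer and the binary cross-entropy loss we use later.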
- Let's define the input layer and the hidden layer of the encoder part of the VAE:
# VAE input layer and hidden layer encoder
input <- layer_input(shape = c(input_dim))
x <- input %>% layer_dense(units = 256, activation = "relu")
- Now, we configure the dense layers that output the mean and the log variance of the latent distribution:
# mean of latent distribution
z_mean <- x %>% layer_dense(units = latent_dim, name = "mean")
# log variance of latent distribution
z_log_sigma <- x %>% layer_dense(units = latent_dim, name = "sigma")
- Next, let's define a sampling function so that we can sample new points from the latent space:
# sampling
sampling <- function(arg) {
  z_mean <- arg[, 1:latent_dim]
  z_log_var <- arg[, (latent_dim + 1):(2 * latent_dim)]
  epsilon <- k_random_normal(shape = list(k_shape(z_mean)[1], latent_dim),
                             mean = 0, stddev = 1)
  z_mean + k_exp(z_log_var / 2) * epsilon
}
- Now, we create a layer that takes the mean and log variance of the latent distribution and generates a random sample from it:
# random point from latent distribution
z <- layer_concatenate(list(z_mean, z_log_sigma)) %>% layer_lambda(sampling)
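The lambda layer above implements the reparameterization trick: instead of sampling `z` directly, which would block gradients from flowing back to the encoder, we sample noise from a standard normal and shift and scale it deterministically. The same idea with plain R vectors, using hypothetical encoder outputs for illustration:

```r
# reparameterization trick with base R vectors (illustration only)
set.seed(42)
z_mean_val <- c(0.5, -1.0)   # hypothetical encoder mean outputs
z_log_var_val <- c(0.0, 0.2) # hypothetical encoder log-variance outputs
epsilon <- rnorm(2)          # noise from a standard normal
# exp(log_var / 2) recovers the standard deviation
z_sample <- z_mean_val + exp(z_log_var_val / 2) * epsilon
```

The randomness lives entirely in `epsilon`, so the transformation from the encoder outputs to `z_sample` stays differentiable.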
- So far, we have defined a layer that draws a random point from the latent space. Next, we create the hidden layers of the decoder as standalone layer objects; defining them this way lets us reuse the same weights when we build a separate decoder model later. We then combine them to create the output layer:
# VAE decoder hidden layers
x_1 <- layer_dense(units = 256, activation = "relu")
x_2 <- layer_dense(units = input_dim, activation = "sigmoid")
# decoder output
vae_output <- x_2(x_1(z))
- Next, we build a variational autoencoder and visualize its summary:
# variational autoencoder
vae <- keras_model(input, vae_output)
summary(vae)
The following screenshot shows the summary of the VAE model:
- Now, we create a separate encoder model:
# encoder, from inputs to latent space
encoder <- keras_model(input, list(z_mean, z_log_sigma))
summary(encoder)
The following screenshot shows the summary of the encoder model:
- Let's create an independent decoder model as well:
# Decoder input
decoder_input <- layer_input(k_int_shape(z)[-1])
# Decoder hidden layers
decoder_output <- x_2(x_1(decoder_input))
# Decoder
decoder <- keras_model(decoder_input, decoder_output)
summary(decoder)
The following screenshot shows the summary of the decoder model:
- Next, we define a custom loss function for the VAE:
# loss function: reconstruction term plus KL divergence from the standard normal prior
vae_loss <- function(x, decoded_output){
  # binary cross-entropy averaged per pixel, scaled back up to the full image dimension
  reconstruction_loss <- input_dim * loss_binary_crossentropy(x, decoded_output)
  # KL divergence between the latent distribution and a standard normal
  kl_loss <- -0.5 * k_mean(1 + z_log_sigma - k_square(z_mean) - k_exp(z_log_sigma), axis = -1L)
  reconstruction_loss + kl_loss
}
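To see what the KL term measures, note that it is zero exactly when the latent distribution already matches the standard normal prior (zero mean and unit variance, hence a log variance of zero), and positive otherwise. The same formula checked with base R:

```r
# KL divergence term from the loss above, computed with base R (illustration only)
kl_term <- function(z_mean, z_log_var) {
  -0.5 * mean(1 + z_log_var - z_mean^2 - exp(z_log_var))
}
kl_term(c(0, 0), c(0, 0))  # 0: already matches the prior
kl_term(c(1, 1), c(0, 0))  # 0.5: penalizes drifting away from the prior
```

During training, this term pulls the encoder's outputs toward the prior, which keeps the latent space dense enough that decoding random points produces plausible images.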
- Then, we compile the model:
# compile
vae %>% compile(optimizer = "rmsprop", loss = vae_loss)
Afterwards, we train the model:
# train
vae %>% fit(
  x_train, x_train,
  shuffle = TRUE,
  epochs = epochs,
  batch_size = batch_size,
  validation_data = list(x_test, x_test)
)
- Now, let's have a look at some sample images generated by the model:
# sample random latent points and decode them into images
library(abind)
library(grid)
random_distribution <- array(rnorm(n = 20, mean = 0, sd = 4), dim = c(10, 2))
# start with the image decoded from the origin of the latent space
predicted <- array_reshape(predict(decoder, matrix(c(0, 0), ncol = 2)), dim = c(28, 28))
for (i in seq(1, nrow(random_distribution))) {
  one_pred <- predict(decoder, matrix(random_distribution[i, ], ncol = 2))
  predicted <- abind(predicted, array_reshape(one_pred, dim = c(28, 28)), along = 2)
}
options(repr.plot.width = 10, repr.plot.height = 1)
grid.raster(predicted, interpolate = FALSE)
The following image shows the images that were generated after the 10th epoch:
In the next section, we'll go through a detailed explanation of the steps we implemented in this section.