How to do it...

Let's start by declaring a few variables that will be required for the model's configuration:

  1. First, we define the image size in terms of its height, width, and number of channels. Since we are working with color images, we set the number of channels to 3, which corresponds to RGB mode. We also define the shape of the latent-space vectors:
latent_dim <- 32
height <- 32
width <- 32
channels <- 3

  2. Next, we create the generator network. The generator network maps random vectors of shape latent_dim to images of the input size, which in our case is (32, 32, 3):
input_generator <- layer_input(shape = c(latent_dim))

output_generator <- input_generator %>%
  # First, transform the input into a 16x16, 128-channel feature map
  layer_dense(units = 128 * 16 * 16) %>%
  layer_activation_leaky_relu() %>%
  layer_reshape(target_shape = c(16, 16, 128)) %>%
  # Next, we add a convolutional layer
  layer_conv_2d(filters = 256, kernel_size = 5,
                padding = "same") %>%
  layer_activation_leaky_relu() %>%
  # Upsample the data to 32x32 using layer_conv_2d_transpose()
  layer_conv_2d_transpose(filters = 256, kernel_size = 4,
                          strides = 2, padding = "same") %>%
  layer_activation_leaky_relu() %>%
  # Add more convolutional layers to the network
  layer_conv_2d(filters = 256, kernel_size = 5,
                padding = "same") %>%
  layer_activation_leaky_relu() %>%
  layer_conv_2d(filters = 256, kernel_size = 5,
                padding = "same") %>%
  layer_activation_leaky_relu() %>%
  # Produce a 32x32, 3-channel (RGB) image with values in [-1, 1]
  layer_conv_2d(filters = channels, kernel_size = 7,
                activation = "tanh", padding = "same")

generator <- keras_model(input_generator, output_generator)

Let's look at the summary of the generator network:

summary(generator)

The following screenshot shows the description of the generator model:
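
As a quick sanity check (purely illustrative, not part of the original recipe; the names sample_noise and sample_image are our own), we can pass a single random latent vector through the untrained generator and confirm that the output has the expected shape of (1, 32, 32, 3):

sample_noise <- matrix(rnorm(latent_dim), nrow = 1, ncol = latent_dim)
sample_image <- generator %>% predict(sample_noise)
dim(sample_image) # 1 32 32 3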

  3. Now, we create the discriminator network. This network takes images of shape (32, 32, 3), both real and generated, maps each one to a binary value, and estimates the probability that the image is real rather than fake:
input_discriminator <- layer_input(shape = c(height, width, channels))

output_discriminator <- input_discriminator %>%
  layer_conv_2d(filters = 128, kernel_size = 3) %>%
  layer_activation_leaky_relu() %>%
  layer_conv_2d(filters = 128, kernel_size = 4, strides = 2) %>%
  layer_activation_leaky_relu() %>%
  layer_conv_2d(filters = 128, kernel_size = 4, strides = 2) %>%
  layer_activation_leaky_relu() %>%
  layer_conv_2d(filters = 128, kernel_size = 4, strides = 2) %>%
  layer_activation_leaky_relu() %>%
  layer_flatten() %>%
  # One dropout layer
  layer_dropout(rate = 0.3) %>%
  # Classification layer
  layer_dense(units = 1, activation = "sigmoid")

discriminator <- keras_model(input_discriminator, output_discriminator)

Let's look at the summary of the discriminator network:

summary(discriminator)

The following screenshot shows the description of the discriminator model:
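
Before compiling, we can optionally verify (again, just an illustrative check, reusing the sample_image from the earlier generator check) that the discriminator maps a batch of images to one probability per image:

sample_prob <- discriminator %>% predict(sample_image)
dim(sample_prob) # 1 1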

After configuring the discriminator network, we compile it. We use rmsprop as the optimizer and binary_crossentropy as the loss function, with a learning rate of 0.0008. We use clipvalue for gradient clipping, which limits the magnitude of each gradient component so that optimization remains stable near steep regions of the loss surface:

discriminator %>% compile(
  optimizer = optimizer_rmsprop(lr = 0.0008, clipvalue = 1.0, decay = 1e-8),
  loss = "binary_crossentropy"
)
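
To make the effect of clipvalue = 1.0 concrete, here is a minimal sketch of the element-wise clipping rule in plain R (this is an illustration of the idea, not Keras internals; the helper name clip_value is ours). Each gradient component is clamped to the range [-1, 1] before the parameter update:

clip_value <- function(g, v = 1.0) pmin(pmax(g, -v), v)
clip_value(c(-3, 0.5, 2)) # returns -1.0 0.5 1.0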
  4. Before we start training the GAN network, we freeze the weights of the discriminator to make it non-trainable. Note that because the discriminator was compiled before freezing, calling train_on_batch() on it directly still updates its weights; the freeze only applies to models compiled afterward, such as the gan model we build next:
freeze_weights(discriminator)
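
We can confirm that the freeze took effect (an optional check) by re-running the summary, which should now report zero trainable parameters for the discriminator:

summary(discriminator)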

  5. Let's configure the DCGAN network and compile it. The GAN model chains the generator and the discriminator networks together:
gan_input <- layer_input(shape = c(latent_dim), name = "dc_gan_input")
gan_output <- discriminator(generator(gan_input))
gan <- keras_model(gan_input, gan_output)

gan %>% compile(
  optimizer = optimizer_rmsprop(lr = 0.0004, clipvalue = 1.0, decay = 1e-8),
  loss = "binary_crossentropy"
)

Let's take a look at the summary of our GAN model:

summary(gan)

The following screenshot shows the description of the GAN model:
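
As a final illustrative check (the name gan_prob is ours), the combined model should map a latent vector straight to a single probability:

gan_prob <- gan %>% predict(matrix(rnorm(latent_dim), nrow = 1, ncol = latent_dim))
dim(gan_prob) # 1 1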

  6. Now, let's start training the network. We train our DCGAN for 2,000 iterations, using a batch of 40 images in each iteration. We create a folder named dcgan_images to store the images generated at various iterations, and another folder named dcgan_model to store the models saved at different iterations:
# The abind and grid packages are used later to assemble and save image grids
library(abind)
library(grid)

iterations <- 2000
batch_size <- 40
dir.create("dcgan_images")
dir.create("dcgan_model")

Now, we train our GAN model:

start_index <- 1

for (i in 1:iterations) {

  # Sample random points from a normal distribution in the latent space;
  # the resulting matrix has dimensions (batch_size x latent_dim)
  random_latent_vectors <- matrix(rnorm(batch_size * latent_dim),
                                  nrow = batch_size, ncol = latent_dim)

  # Use the generator network to decode these random points into fake images
  generated_images <- generator %>% predict(random_latent_vectors)

  # Combine the fake images with real images to build the training data
  # for the discriminator
  stop_index <- start_index + batch_size - 1
  real_images <- training_data[start_index:stop_index,,,]
  rows <- nrow(real_images)
  combined_images <- array(0, dim = c(rows * 2, dim(real_images)[-1]))
  combined_images[1:rows,,,] <- generated_images
  combined_images[(rows + 1):(rows * 2),,,] <- real_images

  # Provide the appropriate labels: 1 for fake images, 0 for real images
  labels <- rbind(matrix(1, nrow = batch_size, ncol = 1),
                  matrix(0, nrow = batch_size, ncol = 1))

  # Add random noise to the labels to increase the robustness of the discriminator
  labels <- labels + (0.5 * array(runif(prod(dim(labels))),
                                  dim = dim(labels)))

  # Train the discriminator on both real and fake images
  discriminator_loss <- discriminator %>% train_on_batch(combined_images, labels)

  # Sample new random points in the latent space
  random_latent_vectors <- matrix(rnorm(batch_size * latent_dim),
                                  nrow = batch_size, ncol = latent_dim)

  # Assemble labels that say "all real images"
  misleading_targets <- array(0, dim = c(batch_size, 1))

  # Train the generator via the gan model; note that the discriminator
  # weights are frozen inside gan
  gan_model_loss <- gan %>% train_on_batch(
    random_latent_vectors,
    misleading_targets
  )

  start_index <- start_index + batch_size
  if (start_index > (nrow(training_data) - batch_size))
    start_index <- 1

  # At selected iterations, save the model and a sample of generated images
  if (i %in% c(5, 10, 15, 20, 40, 100, 200, 500, 800, 1000, 1500, 2000)) {

    # Save the model
    save_model_hdf5(gan, paste0("dcgan_model/gan_model_", i, ".h5"))

    # Rescale the generated images (tanh output in [-1, 1]) to the [0, 1]
    # range and arrange the first five side by side in a grid
    generated_images <- (generated_images - min(generated_images)) /
      (max(generated_images) - min(generated_images))
    grid <- generated_images[1,,,]
    for (j in seq(2, 5)) {
      single <- generated_images[j,,,]
      grid <- abind(grid, single, along = 2)
    }
    png(file = paste0("dcgan_images/generated_flowers_", i, ".png"),
        width = 600, height = 350)
    grid.raster(grid, interpolate = FALSE)
    dev.off()
  }
}

After 2,000 iterations, the generated images look as follows:

If we wish to improve the quality of the generated images, we can train the model for more iterations.
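
Once training finishes, we can sample fresh images directly from the trained generator. Here is a minimal sketch (the names noise and new_images are ours, and the sample count of 10 is arbitrary):

noise <- matrix(rnorm(10 * latent_dim), nrow = 10, ncol = latent_dim)
new_images <- generator %>% predict(noise)
dim(new_images) # 10 32 32 3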
