Handling image data

In this section, we will read image data into R and explore it to understand its main characteristics. The code for reading and displaying images is as follows:

# Libraries
library(keras)
library(EBImage)

# Reading and plotting images
setwd("~/Desktop/image18")
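# list.files() expects a regular expression, so match names ending in .jpg
temp <- list.files(pattern = "\\.jpg$")
mypic <- list()
for (i in seq_along(temp)) {mypic[[i]] <- readImage(temp[i])}
# Arrange the plots in a 3 x 6 grid
par(mfrow = c(3,6))
for (i in seq_along(temp)) plot(mypic[[i]])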
par(mfrow = c(1,1))

As the preceding code shows, we make use of the keras and EBImage libraries. The EBImage library is useful for handling and exploring image data. We start by reading 18 JPEG image files that are stored in the image18 folder on my computer. The set consists of six pictures each of bicycles, cars, and airplanes that were downloaded from the internet. These image files are read using the readImage function and are stored in the list mypic.
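The file-listing step can be tried without any image files. The following is a minimal sketch that uses only base R; the folder and file names are hypothetical and are created in a temporary directory purely for illustration. It shows that the pattern argument of list.files is a regular expression rather than a shell wildcard, and that seq_along is a safe way to loop over the result.

```r
# Create a throwaway folder with hypothetical file names (not the book's data)
demo_dir <- file.path(tempdir(), "image_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir,
  c("bike1.jpg", "car1.JPG", "plane1.jpg.bak", "notes_jpg.txt")
)))

# pattern is a regular expression: "\\.jpg$" means "name ends in .jpg"
jpg_files <- list.files(demo_dir, pattern = "\\.jpg$", ignore.case = TRUE)
print(jpg_files)

# seq_along() yields an empty loop when nothing matches, unlike 1:length(x)
for (i in seq_along(jpg_files)) {
  cat(i, jpg_files[i], "\n")
}
```

Only the two names that end in .jpg (in either case) are returned; the backup file and the text file are skipped.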

All 18 images are shown in the following screenshot:

From the preceding screenshot, we can see six images each of bicycles, cars, and airplanes. You might have noticed that the pictures are not all the same size. For example, the fifth and sixth bicycles differ noticeably in size, as do the fourth and fifth airplanes. Let's take a closer look at the data for the fifth bicycle using the following code:

# Exploring 5th image data
print(mypic[[5]])

OUTPUT
Image 
  colorMode    : Color 
  storage.mode : double 
  dim          : 299 169 3 
  frames.total : 3 
  frames.render: 1 

imageData(object)[1:5,1:6,1]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    1    1    1
[2,]    1    1    1    1    1    1
[3,]    1    1    1    1    1    1
[4,]    1    1    1    1    1    1
[5,]    1    1    1    1    1    1

hist(mypic[[5]])

Using the print function, we can look at how the image of a bicycle (unstructured data) has been converted into numbers (structured data). The dimensions for the fifth bicycle are 299 x 169 x 3, which gives a total of 151,593 data points, obtained by multiplying the three numbers. The first number, 299, is the image width in pixels and the second number, 169, is the image height in pixels, so the image has 299 x 169 = 50,531 pixels. The third number, 3, is the number of channels: a color image consists of three channels representing the colors red, green, and blue, and each pixel has one intensity value per channel. The small table extracted from the data shows the first five positions along the x-dimension (rows of the table) and the first six positions along the y-dimension (columns of the table), with the z-dimension fixed at 1, that is, the first channel. All values in this small extract happen to be 1, but across an image the values can vary between 0 and 1.
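The arithmetic on the dimensions can be checked with a short base R sketch. The array below is only a stand-in with the same dimensions as the fifth image; it is filled with 1s and is not the actual image data, and the variable names are chosen here for illustration.

```r
# Stand-in array with the dimensions reported for the fifth image
# (width 299, height 169, 3 channels); filled with 1s, not real pixel data
img <- array(1, dim = c(299, 169, 3))

dims <- dim(img)
n_pixels <- dims[1] * dims[2]   # pixel locations (width x height)
n_values <- prod(dims)          # intensity values across all channels

print(n_pixels)   # 50531
print(n_values)   # 151593

# Same kind of slice as in the printed output: x 1 to 5, y 1 to 6, channel 1
print(img[1:5, 1:6, 1])
```

The slice is a 5 x 6 matrix, which matches the shape of the small table in the printed output.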

A color image has red, green, and blue channels. A grayscale image has only one channel.

These data points for the fifth bicycle are used for creating a histogram, as shown in the following screenshot:

The preceding histogram shows the distribution of intensity values for the fifth image's data. It can be seen that most of the data points have high-intensity values for this image.

Let's now look at the following histogram of data based on the 16th image (that of an airplane) for comparison:

From the preceding histogram, we can see that this image has different intensity values for the red, green, and blue colors. In general, intensity values lie between zero and one. Data points that are closer to zero represent a darker color in the image and those closer to one indicate a brighter color in the image.
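To make the link between channels and intensity values concrete, here is a small base R sketch. The 2 x 2 "image" and its values are made up for illustration and do not come from the airplane picture; the sketch only shows how per-channel summaries and histogram counts can be computed from a three-dimensional array.

```r
# Hypothetical 2 x 2 color image: values are made up for illustration
toy <- array(0, dim = c(2, 2, 3))
toy[, , 1] <- c(0.9, 0.8, 1.0, 0.9)   # red channel: bright
toy[, , 2] <- c(0.5, 0.4, 0.6, 0.5)   # green channel: mid-range
toy[, , 3] <- c(0.1, 0.0, 0.2, 0.1)   # blue channel: dark

# Mean intensity per channel (the third dimension indexes the channel)
channel_means <- apply(toy, 3, mean)
names(channel_means) <- c("red", "green", "blue")
print(round(channel_means, 2))

# Histogram counts per channel over four equal-width bins, without plotting
breaks <- seq(0, 1, by = 0.25)
counts <- sapply(1:3, function(k) {
  hist(toy[, , k], breaks = breaks, plot = FALSE)$counts
})
colnames(counts) <- names(channel_means)
print(counts)
```

In this toy example, the red values all fall in the highest bin and the blue values all fall in the lowest bin, which is the kind of per-channel difference a histogram of a color image makes visible.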

Let's take a look at data related to the 16th image, of an airplane, using the following code:

# Exploring 16th image data
print(mypic[[16]])

OUTPUT
Image 
  colorMode    : Color 
  storage.mode : double 
  dim          : 318 159 3 
  frames.total : 3 
  frames.render: 1 

imageData(object)[1:5,1:6,1]
          [,1]      [,2]      [,3]      [,4]      [,5]      [,6]
[1,] 0.2549020 0.2549020 0.2549020 0.2549020 0.2549020 0.2549020
[2,] 0.2549020 0.2549020 0.2549020 0.2549020 0.2549020 0.2549020
[3,] 0.2549020 0.2549020 0.2549020 0.2549020 0.2549020 0.2549020
[4,] 0.2588235 0.2588235 0.2588235 0.2588235 0.2588235 0.2588235
[5,] 0.2588235 0.2588235 0.2588235 0.2588235 0.2588235 0.2588235

From the preceding output, we can see that the two images have different dimensions. The dimensions for the 16th image are 318 x 159 x 3, which gives a total of 151,686 data points (318 x 159 = 50,562 pixels, each with three channel values).
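The numbers in the two outputs can be compared with a few lines of base R. This sketch types in the printed values and dimensions by hand rather than reading the images. It also assumes that the JPEG files store 8 bits per channel, which is typical but is not stated in the text; under that assumption, each intensity is an integer level from 0 to 255 divided by 255.

```r
# Values printed for the 16th image
printed <- c(0.2549020, 0.2588235)

# Assumption: 8 bits per channel, so stored levels are integers 0..255
levels_8bit <- printed * 255
print(round(levels_8bit))           # nearest 8-bit levels: 65 66

# Dimensions reported for the two images
dim5  <- c(299, 169, 3)
dim16 <- c(318, 159, 3)
print(identical(dim5, dim16))       # FALSE
print(c(prod(dim5), prod(dim16)))   # 151593 151686
```

Because the two arrays have different shapes, they cannot be stacked into a single array as they are, which is why the images need a common size before modeling.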

In order to prepare this data for developing an image classification model, we will start by resizing all images to the same dimensions.
