We are going to demonstrate how to do image processing using Python's libraries such as NumPy and SciPy.
In scientific computing, images are usually seen as n-dimensional arrays. They are usually two-dimensional arrays; in our examples, they are represented as a NumPy array data structure. Therefore, functions and operations performed on those structures are seen as matrix operations.
Images in this sense are not always two-dimensional. For medical or bio-sciences, images are data structures of higher dimensions such as 3D (having the z axis as depth or as the time axis) or 4D (having three spatial dimensions and a temporal one as the fourth dimension). We will not be using those in this recipe.
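To make the idea of dimensionality concrete, here is a minimal sketch that builds empty arrays of two, three, and four dimensions. The sizes and variable names are assumptions chosen only for illustration; they do not come from any real data set.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
gray_2d = np.zeros((64, 96), dtype=np.uint8)           # rows, columns
volume_3d = np.zeros((10, 64, 96), dtype=np.uint8)     # depth (or time), rows, columns
series_4d = np.zeros((5, 10, 64, 96), dtype=np.uint8)  # time, depth, rows, columns

for name, img in [("2D", gray_2d), ("3D", volume_3d), ("4D", series_4d)]:
    # ndim is the number of axes; shape lists the length of each axis
    print(name, img.ndim, img.shape)
```

Whatever the number of axes, the same indexing and arithmetic operations apply, which is why treating images as arrays is convenient.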
We can import images using various techniques; which one to choose depends on what you want to do with the image, on the larger ecosystem of tools you are using, and on the platform you are running your project on.
In this recipe, we will demonstrate several ways to use image processing in Python, mainly related to scientific processing and less on the artistic side of image manipulation.
In some examples in this recipe, we use the SciPy library. SciPy builds on NumPy but is a separate package, so installing NumPy does not necessarily install it. If you don't have it yet, you can usually install it with your OS's package manager; on Debian or Ubuntu, for example, execute the following command:
$ sudo apt-get install python-scipy
For Windows users, we recommend using prepackaged Python environments like EPD, which we discussed in Chapter 1, Preparing Your Working Environment.
If you want to install these using official source distributions, make sure you have installed system dependencies, such as:
libblas and liblapack
gcc and gfortran
Whoever has worked in the field of digital signal processing or even attended a university course on this or a related subject must have come across Lena's image, the de facto standard image used for demonstrating image processing algorithms.
SciPy contains this image already packed inside the misc module, so it is really simple for us to reuse that image. This is how you can read and show this image:
import scipy.misc
import matplotlib.pyplot as plt
# load already prepared ndarray from scipy
# (older SciPy releases only; lena() was removed in later versions)
lena = scipy.misc.lena()
# set the default colormap to gray
plt.gray()
plt.imshow(lena)
plt.colorbar()
plt.show()
This should open a new window with a figure displaying Lena's image in gray tones, with axes. The color bar shows the range of values in the figure; here it runs from 0 (black) to 255 (white).
Further, we could examine this object with the following code:
print(lena.shape)
print(lena.max())
print(lena.dtype)
The output for the preceding code is shown here:
(512, 512)
245
int32
We see that the image is 512 points wide and 512 points high, that the maximum value in the array is 245, and that the values are stored as 32-bit integers.
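The same kind of inspection works on any NumPy array. The following is a minimal sketch that assumes a synthetic 512 x 512 gradient as a stand-in for a test image; the array and its values are made up and are not the image discussed above.

```python
import numpy as np

# Synthetic stand-in for a test image: a horizontal gradient from 0 to 255.
# This is NOT the image discussed in the text, only a placeholder array.
row = np.linspace(0, 255, 512).astype(np.uint8)
image = np.tile(row, (512, 1))  # repeat the same row 512 times

print(image.shape)               # (rows, columns)
print(image.min(), image.max())  # darkest and brightest values
print(image.dtype)               # storage type of each pixel
```

Checking the shape, value range, and dtype first is a cheap way to catch surprises, such as an unexpected extra channel axis, before any processing starts.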
We could also read in an image using Python Imaging Library (PIL), which we installed in Chapter 1, Preparing Your Working Environment.
Here is the code to do that:
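import numpy
from PIL import Image
import matplotlib.pyplot as plt
# open the file with PIL
bug = Image.open('stinkbug.png')
# convert the pixel data to a (height, width, 3) array; this assumes an RGB image
arr = numpy.array(bug.getdata(), numpy.uint8).reshape(bug.size[1], bug.size[0], 3)
plt.gray()
plt.imshow(arr)
plt.colorbar()
plt.show()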
We should see something similar to Lena's image as shown in the following table:
This is useful if we are already tapping into an existing system that uses PIL as its default image loader.
Other than just loading the images, what we really want to do is use Python to manipulate images and process them. For example, we want to be able to load a real image that consists of RGB channels, convert that into a one-channel ndarray, and later use array slicing to zoom in on a part of the image. Here's the code to demonstrate how we are able to use NumPy and matplotlib to do that:
import matplotlib.pyplot as plt
import scipy.misc
import numpy
bug = scipy.misc.imread('stinkbug1.png')
# if you want to inspect the shape of the loaded image
# uncomment following line
# print(bug.shape)
# the original image is RGB having values for all three
# channels separately. For simplicity, we convert that to greyscale image
# by picking up just one channel.
# convert to gray
bug = bug[:,:,0]
The expression bug[:,:,0] is called array slicing. This NumPy feature allows us to select any part of the multidimensional array. For example, let's see a one-dimensional array:
>>> from numpy import array
>>> a = array([5, 1, 2, 3, 4])
>>> a[2:3]
array([2])
>>> a[:2]
array([5, 1])
>>> a[3:]
array([3, 4])
For multidimensional arrays, we separate each dimension with a comma (,) as shown here:
>>> b = array([[1,1,1],[2,2,2],[3,3,3]]) # matrix 3 x 3
>>> b[0,:] # pick first row
array([1, 1, 1])
>>> b[:,0] # we pick the first column
array([1, 2, 3])
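Selecting a single channel with a slice such as bug[:,:,0] is the simplest way to get a one-channel array, but it ignores the other two channels. The sketch below compares it with a weighted sum of the channels. The weighted version is an assumption added for comparison, not the method used in this recipe; it uses the common ITU-R BT.601 luma coefficients, and the tiny test image is made up.

```python
import numpy as np

# Synthetic 2 x 2 RGB image (rows, columns, channels); values are made up.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Approach used in the text: keep only the first (red) channel.
red_only = rgb[:, :, 0]

# Alternative (an assumption, not the recipe's method): weighted sum of the
# three channels using the ITU-R BT.601 luma coefficients.
weights = np.array([0.299, 0.587, 0.114])
luma = np.dot(rgb, weights).round().astype(np.uint8)

print(red_only)
print(luma)
```

With the single-channel slice, pure green and pure blue pixels both become 0, while the weighted sum keeps them distinguishable. Which one is appropriate depends on what the later processing needs.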
Have a look at the following code:
# show original image
plt.figure()
plt.gray()
plt.subplot(121)
plt.imshow(bug)
# show 'zoomed' region
zbug = bug[100:350,140:350]
Here we zoom into a particular portion of the whole image. Remember that the image is just a multidimensional array represented as a NumPy array. Zooming here means selecting a range of rows and columns from this matrix. So we select a partial matrix from row 100 up to (but not including) row 350 and from column 140 up to (but not including) column 350. Remember that indexing starts at 0, so the row at coordinate 100 is the 101st row.
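One detail worth knowing about this kind of zoom is that basic slicing returns a view on the original data rather than a copy. The following is a minimal sketch, assuming a synthetic 375 x 500 array in place of the real image; the size and values are hypothetical.

```python
import numpy as np

# Synthetic grayscale image, 375 rows by 500 columns (hypothetical size).
image = np.arange(375 * 500, dtype=np.uint32).reshape(375, 500)

# Same slice as in the text: rows 100..349 and columns 140..349.
crop = image[100:350, 140:350]
print(crop.shape)  # number of selected rows and columns

# Basic slicing returns a view, so the crop shares memory with the original.
print(np.shares_memory(crop, image))

# Take an explicit copy when the crop should be modified independently.
independent = crop.copy()
independent[0, 0] = 0  # does not touch the original image
```

Because the crop is a view, writing into it would also change the original array; calling copy() avoids that at the cost of extra memory.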
Take a look at the following code:
plt.subplot(122)
plt.imshow(zbug)
plt.show()
This will be displayed as shown here:
For large images, we recommend using numpy.memmap for memory mapping of images. This can speed up manipulating the image data because only the parts that are actually accessed are read from disk. Have a look at the following code as an example of this:
import numpy
file_name = 'stinkbug.png'
image = numpy.memmap(file_name, dtype=numpy.uint8, shape=(375, 500))
Here we map a large file into memory and access parts of it as a NumPy array. This is very efficient and allows us to manipulate file data structures as standard NumPy arrays without loading everything into memory. The shape argument defines the shape of the array mapped from the file given by file_name, which can be a file name or a file-like object. Keep in mind that memmap exposes the raw bytes of the file, so the values correspond to pixels only if the file stores uncompressed pixel data. Note that this is a concept similar to Python's mmap module (available at http://docs.python.org/2/library/mmap.html) but is different in a very important way: NumPy's memmap returns an array-like object, while Python's mmap returns a file-like object. So, the way we use them is very different yet very natural in each environment.
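The following is a minimal, self-contained sketch of memory mapping, assuming a hypothetical raw file of uncompressed 8-bit pixels that the example writes itself. The file name image.raw, the 375 x 500 shape, and the pixel values are all made up for illustration.

```python
import os
import tempfile
import numpy as np

# Hypothetical raw file: uncompressed 8-bit pixels, 375 rows by 500 columns.
shape = (375, 500)
pixels = (np.arange(shape[0] * shape[1]) % 256).astype(np.uint8).reshape(shape)

path = os.path.join(tempfile.mkdtemp(), "image.raw")
pixels.tofile(path)  # writes the raw bytes only, with no header

# Map the file read-only; data is read from disk only when it is accessed.
mapped = np.memmap(path, dtype=np.uint8, mode="r", shape=shape)

# Copy just the region of interest into an ordinary in-memory array.
region = np.array(mapped[100:350, 140:350])
print(region.shape, region.dtype)

del mapped  # drop the reference so the mapping can be released
```

Because the file has no header, the dtype and shape must be supplied by the caller; memmap has no way to discover them from the raw bytes.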
There are some specialized packages that focus just on image processing, such as scikit-image (available at http://scikit-image.org/); this is basically a free collection of algorithms for image processing built on top of the NumPy/SciPy libraries. If you want to do edge detection, remove noise from an image, or find contours, scikit-image is the place to look for algorithms. The best way to start is to look at the example gallery and find the example image and code (available at http://scikit-image.org/docs/dev/auto_examples/).