Basic image classification

We will start with a small dataset that was collected especially for this book. It has three classes: buildings, natural scenes (landscapes), and pictures of text. There are 30 images in each category, and they were all taken using a cell phone camera with minimal composition. The images are similar to those that would be uploaded to a modern website by users with no photography training. This dataset is available in the companion code repository. Later in this chapter, we will look at a larger dataset with more images and more categories that are more difficult to classify.

When classifying images, we start with a large rectangular array of numbers (pixel values). Nowadays, images with millions of pixels are common. We could try to feed all these numbers as features into the learning algorithm, but this is not a good idea unless you have a lot of data. The relationship of each pixel (or even each small group of pixels) to the final result is very indirect. In addition, having millions of pixels but only a small number of example images results in a very hard statistical learning problem. This is an extreme form of the P greater than N type of problem (more features than examples) that we discussed in Chapter 3, Regression. Instead, a good approach for smaller problems is to compute features from the image and use those features for classification.
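To make the contrast between raw pixels and computed features concrete, here is a minimal sketch. It is not the feature set used later in the chapter: the helper `intensity_histogram`, the choice of 8 bins, and the randomly generated stand-in image are all assumptions made for illustration. In practice, the image would be loaded from disk as an array rather than built as nested lists.

```python
import random


def intensity_histogram(image, bins=8, max_value=256):
    """Summarize a grayscale image as a normalized intensity histogram.

    `image` is a list of rows, each a list of ints in [0, max_value).
    This helper is hypothetical; it only illustrates the idea of
    replacing raw pixels with a short feature vector.
    """
    counts = [0] * bins
    n_pixels = 0
    for row in image:
        for value in row:
            # Map the pixel value to one of `bins` equal-width buckets.
            counts[value * bins // max_value] += 1
            n_pixels += 1
    # Normalize so the feature vector does not depend on image size.
    return [c / n_pixels for c in counts]


# Synthetic stand-in for a (very small) grayscale photo.
rng = random.Random(0)
height, width = 60, 80
image = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]

# Option 1: every pixel is a feature.
raw_features = [value for row in image for value in row]

# Option 2: a handful of computed features.
hist_features = intensity_histogram(image)

print(len(raw_features))   # one feature per pixel: height * width
print(len(hist_features))  # one feature per histogram bin
```

Even for this tiny 60 by 80 image, the raw representation has 4,800 features, far more than the 90 images in the dataset, while the histogram has only 8. A real photo with millions of pixels makes the imbalance much worse.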

We previously used an example of the scene class. The following are examples of the text and building classes:
