So far in this chapter, we have seen how to use TensorFlow to train a convolutional neural network for the task of image classification. After training our model, we ran it on the test set, which we had set aside at the start, to see how well it performed on data it had never seen before. Evaluating our model on a held-out test set gives us an indication of how well it will generalize once deployed. A model that generalizes well is clearly desirable, as it can be used reliably in many situations.
The choice of CNN architecture is one way to improve the generalization ability of our model. A simple technique to keep in mind is to start by designing your model as simply as possible, with few layers and few filters. Because very small models tend to underfit the data, you can then slowly add complexity until underfitting stops occurring. Designing models this way limits the possibility of overfitting, as you never allow yourself a model that is too large for your dataset.
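As a minimal sketch of this grow-from-small approach, the snippet below builds the same tiny Keras CNN at two capacity levels; the input shape (32x32 RGB) and the 10-class output are assumptions for illustration, not values from the chapter. The idea is to train the smallest version first and only move to a larger one if the training accuracy shows underfitting.

```python
import tensorflow as tf

# Assumed input shape (32x32 RGB images) and 10 output classes,
# chosen purely for illustration.
def build_cnn(num_filters=8):
    """A deliberately small baseline CNN; grow num_filters only
    if the current model underfits the training data."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(num_filters, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# Start small; if training accuracy stays low (underfitting),
# step up capacity, e.g. by doubling the filter count.
small = build_cnn(num_filters=8)
larger = build_cnn(num_filters=16)
print(small.count_params(), larger.count_params())
```

Each step up in capacity should be evaluated against the training metrics before growing further, so the model stays as small as the data allows.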
In this section, however, we will explore other techniques for building a better machine learning model and see how to incorporate them into our training procedure. The following methods aim to prevent overfitting and, in doing so, help create a more robust model that generalizes better. The process of preventing our model from overfitting is called regularization.