The need for transfer learning

We have already briefly discussed the advantages of transfer learning in Chapter 4, Transfer Learning Fundamentals. To recap, we get several benefits, such as a better baseline performance, faster overall model development and training, and a superior final model compared to building a deep learning model from scratch. An important thing to remember here is that transfer learning as a field existed long before deep learning and can also be applied to problems that do not involve deep learning.

Let's now consider a real-world problem, one which we will use throughout this chapter to illustrate our different deep learning models and to leverage transfer learning on them. One of the key requirements of deep learning, which you must have heard time and again, is that we need a lot of data and samples to build robust deep learning models: given enough examples, a model can learn feature representations automatically. But what do we do if we don't have enough training samples and the problem to solve is still relatively complex, for instance, a computer vision problem such as image categorization, which may be difficult to solve using traditional statistical or machine learning (ML) techniques? Do we give up on deep learning?

Considering an image categorization problem, since images are essentially high-dimensional tensors, having more data enables deep learning models to learn better underlying feature representations. However, even if we have only a few hundred to a few thousand image samples per category, a basic CNN can still perform decently with the right architecture and regularization. The key point to remember here is that CNNs learn features that are largely invariant to translation, and, with pooling and data augmentation, reasonably robust to scaling and rotation, so we do not need custom feature engineering techniques here. However, we might still run into problems like model overfitting, which we will try to address later on in this chapter.
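As a rough illustration of what such a basic CNN with regularization might look like, here is a minimal Keras sketch for a binary image categorization task. The 150 x 150 RGB input size, the layer widths, and the dropout rate are all illustrative assumptions, not values prescribed by the text:

```python
# A minimal CNN sketch for binary image categorization.
# Input size (150, 150, 3) and all layer widths are illustrative choices.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, (3, 3), activation='relu'),   # learn local patterns
    layers.MaxPooling2D((2, 2)),                    # downsample, add translation robustness
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.3),                            # regularization to curb overfitting
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),          # binary class probability
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```

With small datasets, the dropout layer (and, in practice, data augmentation) is what keeps a network of this size from simply memorizing the training samples.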

With regard to transfer learning, there are some excellent pretrained deep learning models that have been trained on the famous ImageNet dataset (http://image-net.org/about-overview). We have covered some of these models in detail in Chapter 3, Understanding Deep Learning Architectures, and we will be leveraging the famous VGG-16 model in this chapter. The idea is to use a pretrained model that is already an expert at image categorization to solve our problem despite the constraint of fewer data samples.
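To give a sense of how this looks in code, the following sketch loads VGG-16 through `keras.applications` as a frozen convolutional feature extractor. Note the assumptions: `weights=None` is used here only to skip the pretrained-weight download; in practice you would pass `weights='imagenet'`, and the 150 x 150 input size is again an illustrative choice:

```python
# Sketch: VGG-16 as a frozen convolutional base for transfer learning.
# weights=None avoids the weight download in this sketch; use
# weights='imagenet' in practice to get the pretrained ImageNet weights.
from tensorflow.keras.applications import VGG16

conv_base = VGG16(weights=None,          # 'imagenet' for pretrained weights
                  include_top=False,     # drop the ImageNet classifier head
                  input_shape=(150, 150, 3))
conv_base.trainable = False              # freeze the base so its learned
                                         # features are not destroyed early on
```

Dropping the top classifier (`include_top=False`) leaves only the convolutional base, on top of which we can train a small new classifier for our own categories.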
