There are several ways to improve on the model we built in this chapter. Here are some of the specific aspects that could be improved:
- Using a better image feature extraction model, such as Google's Inception model
- Using higher-resolution, better-quality training images (this needs more GPU power!)
- Adding more training data, either from datasets such as Flickr30K or through image augmentation
- Introducing attention into the model
If you have the necessary data and infrastructure, these are some ideas worth exploring!
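To make the attention idea a little more concrete, here is a minimal NumPy sketch of one soft (additive) attention step over image region features. It is an illustration under stated assumptions, not the model from this chapter: the function name `soft_attention`, the 7x7 grid of 512-dimensional region features, the 256-dimensional decoder state, and the randomly initialized matrices `W_f`, `W_s`, and `v` (stand-ins for parameters that would be learned during training) are all hypothetical.

```python
import numpy as np


def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def soft_attention(region_features, decoder_state, W_f, W_s, v):
    """One additive attention step (hypothetical helper for illustration).

    region_features: (num_regions, feat_dim) image features, one row per region
    decoder_state:   (state_dim,) current hidden state of the caption decoder
    Returns the attended context vector and the attention weights.
    """
    # Score each region against the decoder state: v . tanh(W_f f_i + W_s h)
    projected = np.tanh(region_features @ W_f + decoder_state @ W_s)
    scores = projected @ v                   # shape: (num_regions,)
    weights = softmax(scores)                # non-negative, sums to 1
    # Context vector is the attention-weighted average of region features.
    context = weights @ region_features      # shape: (feat_dim,)
    return context, weights


# Assumed sizes: a 7x7 grid of regions with 512-d features each.
rng = np.random.default_rng(0)
num_regions, feat_dim, state_dim, attn_dim = 49, 512, 256, 128

region_features = rng.normal(size=(num_regions, feat_dim))
decoder_state = rng.normal(size=(state_dim,))

# Random stand-ins for parameters that training would normally learn.
W_f = rng.normal(scale=0.01, size=(feat_dim, attn_dim))
W_s = rng.normal(scale=0.01, size=(state_dim, attn_dim))
v = rng.normal(size=(attn_dim,))

context, weights = soft_attention(region_features, decoder_state, W_f, W_s, v)
print(context.shape, weights.shape)
```

In a captioning model, a step like this would typically run once per generated word, so that the decoder can focus on different image regions as the caption progresses, instead of relying on a single fixed feature vector for the whole image.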