Architecture of VGGNet

VGGNet is one of the most widely used CNN architectures. It was developed by the Visual Geometry Group (VGG) at the University of Oxford, and it became popular after finishing as first runner-up in ILSVRC 2014.

It is a deep convolutional network, widely used for image classification and as a backbone for object-detection tasks. The Oxford team has made the weights and structure of the network publicly available, so we can use these pretrained weights directly for a range of computer vision tasks. It is also widely used as a baseline feature extractor for images.

The architecture of the VGG network is very simple. It consists of stacks of convolutional layers, with each stack followed by a pooling layer. It uses 3 x 3 convolutions and 2 x 2 pooling throughout the network. A network is referred to as VGG-n, where n is the number of weight layers (convolutional and fully connected), excluding the pooling and softmax layers. The following figure shows the architecture of the VGG-16 network:
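
Alongside the figure, the VGG-n naming rule can be checked with a short sketch. It assumes the commonly used VGG-16 layout (13 convolutional layers and 3 fully connected layers, with 1000 output classes), which is not spelled out in the text; the names VGG16_CONV, VGG16_FC, and count_weight_layers are illustrative only.

```python
# Assumed VGG-16 layout: integers are the output channels of a 3 x 3
# convolution, "M" marks a 2 x 2 max-pooling layer.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]

# Assumed fully connected layers that follow the convolutional part;
# 1000 is the number of ImageNet classes.
VGG16_FC = [4096, 4096, 1000]


def count_weight_layers(conv_cfg, fc_cfg):
    """Count layers that carry weights, skipping pooling (and the softmax)."""
    n_conv = sum(1 for item in conv_cfg if item != "M")
    return n_conv + len(fc_cfg)


n_layers = count_weight_layers(VGG16_CONV, VGG16_FC)
print(f"VGG-{n_layers}")
```

Only the convolutional and fully connected layers are counted, so the five pooling layers in the list do not contribute to n.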

As you can see in the following figure, the architecture of VGGNet is characterized by a pyramidal shape: the initial layers are spatially wide and the later layers are narrow. It consists of multiple convolutional layers followed by a pooling layer. Since each pooling layer reduces the spatial dimensions, the network narrows as we go deeper:
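
The narrowing can also be traced numerically. The sketch below is a minimal illustration, assuming a 224 x 224 input, padded convolutions that keep the spatial size unchanged, and the usual VGG-16 block widths; VGG16_BLOCKS and trace_shapes are hypothetical names introduced here.

```python
# Hypothetical block description: (number of 3 x 3 convolutions, output channels).
# Each block is assumed to end with one 2 x 2 max-pooling layer of stride 2.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_shapes(blocks, size=224):
    """Return (height, width, channels) after each block's pooling layer.

    Assumes a square input of side `size` and padded convolutions,
    so only the pooling layers change the spatial size.
    """
    shapes = []
    for _, channels in blocks:
        size //= 2  # 2 x 2 pooling with stride 2 halves height and width
        shapes.append((size, size, channels))
    return shapes


shapes = trace_shapes(VGG16_BLOCKS)
for i, shape in enumerate(shapes, start=1):
    print(f"after block {i}: {shape}")
```

Under these assumptions the feature maps shrink from 112 x 112 after the first block to 7 x 7 after the fifth, while the channel count grows from 64 to 512.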

The main shortcoming of VGGNet is that it is computationally expensive: VGG-16 alone has roughly 138 million parameters, most of them in the fully connected layers.
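
To see where the parameters come from, the count can be worked out layer by layer. This is a rough sketch, assuming the standard VGG-16 layout, a 224 x 224 RGB input, 1000 output classes, and one bias per output channel or unit; none of these values is stated in the text, and count_parameters is an illustrative helper.

```python
# Assumed VGG-16 layout: (number of 3 x 3 convolutions, output channels) per
# block, each block followed by a 2 x 2 pooling layer, then three FC layers.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
FC_SIZES = [4096, 4096, 1000]


def count_parameters(blocks, fc_sizes, in_channels=3, input_size=224):
    """Return (convolutional parameters, fully connected parameters)."""
    conv_params = 0
    channels = in_channels
    size = input_size
    for n_convs, out_channels in blocks:
        for _ in range(n_convs):
            # 3 x 3 kernel weights per input channel, plus one bias per filter
            conv_params += (3 * 3 * channels + 1) * out_channels
            channels = out_channels
        size //= 2  # pooling halves the spatial size; it has no parameters

    fc_params = 0
    features = size * size * channels  # flattened output of the last block
    for units in fc_sizes:
        fc_params += (features + 1) * units  # weights plus one bias per unit
        features = units
    return conv_params, fc_params


conv_params, fc_params = count_parameters(VGG16_BLOCKS, FC_SIZES)
print(f"convolutional layers:   {conv_params:,}")
print(f"fully connected layers: {fc_params:,}")
print(f"total:                  {conv_params + fc_params:,}")
```

Under these assumptions the total comes to about 138 million, of which roughly 124 million sit in the three fully connected layers.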
