Understanding AlexNet

AlexNet is a classic and powerful deep learning architecture. It won ILSVRC 2012 with a top-5 error rate of 15.3%, compared with about 26% for the next-best entry. ILSVRC stands for ImageNet Large Scale Visual Recognition Challenge, one of the biggest competitions focused on computer vision tasks such as image classification, localization, and object detection. ImageNet is a huge dataset containing over 15 million labeled, high-resolution images in roughly 22,000 categories. Every year, researchers competed to win the challenge using innovative architectures.

AlexNet was designed by pioneering scientists, including Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever. It consists of five convolutional layers followed by three fully connected layers, as shown in the following diagram. It uses the ReLU activation function instead of tanh, applied to the output of every convolutional and fully connected layer. To reduce overfitting, it applies dropout in the first two fully connected layers and uses data augmentation techniques such as image translation. It was trained using mini-batch stochastic gradient descent on two GTX 580 GPUs for 5 to 6 days.
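To make the five-convolutional, three-fully-connected layout more concrete, the following sketch traces how the feature-map size changes from layer to layer. The text above does not give kernel sizes, strides, padding, or the pooling layers, so those values are assumptions taken from the commonly cited single-tower AlexNet configuration with a 227 x 227 input. The names out_size, LAYERS, FC_SIZES, and trace_shapes are hypothetical helpers for illustration only:

```python
# Layer hyperparameters below are ASSUMPTIONS (commonly cited AlexNet
# values with a 227x227 input); they are not stated in the text above.

def out_size(n, kernel, stride=1, pad=0):
    """Spatial size after sliding a square conv/pool window over n pixels."""
    return (n + 2 * pad - kernel) // stride + 1

# (name, kind, out_channels, kernel, stride, pad)
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
]

# Three fully connected layers; the last has one unit per ILSVRC class.
FC_SIZES = [4096, 4096, 1000]

def trace_shapes(size=227, channels=3):
    """Return (name, channels, height, width) after each conv/pool layer."""
    shapes = []
    for name, kind, out_ch, kernel, stride, pad in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if kind == "conv":
            channels = out_ch  # pooling keeps the channel count unchanged
        shapes.append((name, channels, size, size))
    return shapes

if __name__ == "__main__":
    shapes = trace_shapes()
    for name, c, h, w in shapes:
        print(f"{name}: {c} x {h} x {w}")
    _, c, h, w = shapes[-1]
    print(f"flatten: {c * h * w}")
    for i, units in enumerate(FC_SIZES, start=1):
        print(f"fc{i}: {units}")
```

Under these assumed settings, the last pooling layer produces a 256 x 6 x 6 volume, which is flattened to 9,216 values before entering the first fully connected layer.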
