Understanding linear SVM algorithm

In Chapter 2, Supervised and Unsupervised Learning Algorithms, we covered the SVM algorithm and now have an idea of how the SVM model works. A linear support vector machine, or linear SVM, is a linear classifier that finds the hyperplane with the largest margin that splits the input space into two regions.

A hyperplane is a generalization of a plane. In one dimension, a hyperplane is a single point; in two dimensions, it is a line; in three dimensions, it is a plane. In higher dimensions, the general term hyperplane is used.

As we saw, the goal of the SVM is to find the separating hyperplane with the largest margin. If the classes are linearly separable, finding such a hyperplane is straightforward. In real-world data, however, the classes are often not linearly separable.
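To make the margin idea concrete, here is a minimal NumPy sketch (the points and the two candidate hyperplanes are invented for illustration). The geometric margin of a hyperplane w·x + b = 0 on labeled data is the smallest signed distance of any point to it; the SVM chooses the separating hyperplane that maximizes this quantity:

```python
import numpy as np

# Toy linearly separable data: labels are +1 or -1.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

def margin(w, b, X, y):
    """Geometric margin of the hyperplane w.x + b = 0 on (X, y):
    the smallest signed distance y_i * (w.x_i + b) / ||w||."""
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)

# Two candidate separating hyperplanes; the SVM would prefer
# the one with the larger margin.
print(margin(np.array([1.0, 1.0]), 0.0, X, y))  # larger margin
print(margin(np.array([1.0, 0.0]), 0.0, X, y))  # smaller margin
```

Both hyperplanes separate the data, but they are not equally good: the first leaves more room between the decision boundary and the nearest points, which is exactly what the SVM optimizes for.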

In such a scenario, the SVM can still separate the two classes by using what is called the kernel trick, which is a method of using a linear classifier to solve a non-linear problem.

The kernel function is applied to each data instance to map the original non-linear observations into a higher-dimensional space in which they become separable.
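The intuition behind this mapping can be sketched with a small NumPy example (the concentric-circle data is invented for illustration). Two classes arranged on circles of different radii cannot be split by any line in 2-D, but mapping each point to its squared radius makes them separable by a single threshold in the new feature space:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes on concentric circles: not separable by a line in 2-D.
angles = rng.uniform(0, 2 * np.pi, size=20)
inner = np.c_[np.cos(angles[:10]), np.sin(angles[:10])]      # radius 1
outer = 3 * np.c_[np.cos(angles[10:]), np.sin(angles[10:])]  # radius 3

def feature_map(X):
    """Map each point (x1, x2) to its squared radius x1^2 + x2^2.
    In this 1-D feature space the two classes separate at a threshold."""
    return (X ** 2).sum(axis=1)

# Every inner point maps to 1 and every outer point to 9,
# so a threshold such as 4 separates them perfectly.
print(feature_map(inner).max() < 4 < feature_map(outer).min())  # True
```

A kernel function lets the SVM work as if this mapping had been applied, without ever computing the higher-dimensional features explicitly.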

The most popular kernel functions available are as follows:

  • The linear kernel
  • The polynomial kernel
  • The RBF (Gaussian) kernel
  • The string kernel

The linear kernel is often recommended for text classification, since most text classification problems involve separating documents into two classes. In our example, too, we want to classify SMS messages as spam or non-spam.
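A minimal sketch of this setup, assuming scikit-learn is available, pairs a TF-IDF bag-of-words representation with a linear SVM (the tiny message set below is invented for illustration, not the dataset used later in the chapter):

```python
# Assumes scikit-learn is installed; the toy SMS messages are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

messages = [
    "WINNER!! Claim your free prize now",
    "URGENT: you have won a cash award, call now",
    "Free entry to win, text WIN to claim",
    "Are we still meeting for lunch today?",
    "Can you send me the report by Friday?",
    "Happy birthday! See you at the party",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TF-IDF features followed by a linear SVM; LinearSVC is
# equivalent to SVC with a linear kernel but faster for text.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(messages, labels)

print(model.predict(["claim your free cash prize now"]))
```

With so few training messages this is only a demonstration of the pipeline shape; the words shared with the spam examples push the new message toward the spam side of the hyperplane.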
