There's more...

In this exercise, we have seen how to use various kernels in our code. A kernel function must be symmetric and, preferably, should have a positive semi-definite Gram matrix. Given a set V of m vectors, the Gram matrix is the m × m matrix of all pairwise inner products of the vectors in V. For convenience, we do not distinguish between positive semi-definite and positive-definite functions here. In practice, positive definiteness of the kernel matrix ensures that the optimization problem is convex, so kernel algorithms converge to a unique solution.
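To make the Gram matrix idea concrete, here is a minimal sketch in plain Python. The sample vectors, the helper names (`gram_matrix`, `is_symmetric`, `quadratic_form`), and the test vector `z` are all made up for illustration. The quadratic form check only probes a single `z`; it is a necessary condition for positive semi-definiteness, not a proof.

```python
def linear_kernel(x, y):
    """Plain dot product, used here only as an example kernel."""
    return sum(a * b for a, b in zip(x, y))


def gram_matrix(kernel, vectors):
    """Return the m x m matrix of kernel values for every pair of vectors."""
    return [[kernel(u, v) for v in vectors] for u in vectors]


def is_symmetric(K, tol=1e-12):
    """Check K[i][j] == K[j][i] for all i, j (up to a small tolerance)."""
    n = len(K)
    return all(abs(K[i][j] - K[j][i]) <= tol for i in range(n) for j in range(n))


def quadratic_form(K, z):
    """Compute z^T K z, which is >= 0 for every z when K is positive semi-definite."""
    n = len(K)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))


# Hypothetical sample vectors (m = 3, two features each).
V = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
K = gram_matrix(linear_kernel, V)

print(K)
print(is_symmetric(K))
print(quadratic_form(K, [1.0, -2.0, 0.5]))
```

For a thorough check, you would typically inspect the eigenvalues of K with a numerical library rather than test individual vectors.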

A linear kernel is the simplest of all kernels available. It works well with text classification problems. 

A linear kernel is presented as follows:

$$K(x, y) = x^T y + c$$

Here, x and y are the input vectors and c is the constant term.
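As a small illustration, the linear kernel can be sketched in a few lines of plain Python, assuming the common form K(x, y) = x·y + c. The function name and the sample vectors are hypothetical.

```python
def linear_kernel(x, y, c=0.0):
    """Linear kernel: dot product of x and y plus a constant term c."""
    return sum(a * b for a, b in zip(x, y)) + c


# Made-up example vectors.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

print(linear_kernel(x, y))         # 32.0
print(linear_kernel(x, y, c=1.0))  # 33.0
```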

A polynomial kernel has three adjustable parameters: a slope, a constant, and the degree. A polynomial kernel with a slope of 1, no constant, and a degree of 1 is simply a linear kernel. As the degree of the polynomial kernel increases, the decision function becomes more complex. With higher degrees, it is possible to get good training accuracy, but the model might fail to generalize to unseen data, leading to overfitting. The polynomial kernel is represented as follows:

$$K(x, y) = (\alpha x^T y + c)^d$$

Here, α is the slope, d is the degree of the kernel, and c is the constant term.
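The relationship between the polynomial and linear kernels can be checked with a short sketch. It assumes the common form K(x, y) = (α x·y + c)^d; the parameter name `alpha` for the slope and the sample vectors are assumptions made for this example.

```python
def polynomial_kernel(x, y, alpha=1.0, c=1.0, d=3):
    """Polynomial kernel: (alpha * <x, y> + c) ** d."""
    dot = sum(a * b for a, b in zip(x, y))
    return (alpha * dot + c) ** d


# Made-up example vectors; their dot product is 11.
x = [1.0, 2.0]
y = [3.0, 4.0]

# Slope 1, no constant, degree 1: reduces to the plain dot product.
print(polynomial_kernel(x, y, alpha=1.0, c=0.0, d=1))  # 11.0

# Degree 2 with a constant of 1: (11 + 1) ** 2.
print(polynomial_kernel(x, y, alpha=1.0, c=1.0, d=2))  # 144.0
```

If you use scikit-learn, the `SVC` parameters `gamma`, `coef0`, and `degree` play the roles of the slope, the constant, and the degree for `kernel="poly"`.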

The radial basis function (RBF) kernel, also known as the Gaussian kernel, is a more complicated kernel and can often outperform polynomial kernels. The RBF kernel is given as follows:

$$K(x, y) = \exp\left(-\frac{\lVert x - y \rVert^2}{2\sigma^2}\right)$$

The σ parameter can be tuned to improve the performance of the kernel. This is important: if σ is overestimated, the kernel can lose its non-linear power and behave almost linearly. On the other hand, if σ is underestimated, the decision function can become highly sensitive to noise in the training data.
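The effect of the width parameter can be seen in a short sketch. It assumes the Gaussian form K(x, y) = exp(-||x - y||² / (2σ²)), with `sigma` as the tunable parameter; the two sample points are made up.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Two made-up points whose squared distance is 2.
x = [0.0, 0.0]
y = [1.0, 1.0]

# Small sigma: the points look unrelated (value near 0).
# Large sigma: the points look nearly identical (value near 1).
for sigma in (0.1, 1.0, 10.0):
    print(sigma, rbf_kernel(x, y, sigma))
```

Note that scikit-learn parameterizes the same kernel as exp(-gamma ||x - y||²), so under this form `gamma` corresponds to 1 / (2σ²): a large σ means a small `gamma`, and vice versa.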

Not all kernels are strictly positive-definite. The sigmoid kernel function, though quite widely used, is not positive-definite for all parameter values. The sigmoid function is given as follows:

$$K(x, y) = \tanh(\alpha x^T y + c)$$

Here, α is the slope and c is the constant term. Note that an SVM with a sigmoid kernel is equivalent to a two-layer perceptron neural network.
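A quick sketch shows why the sigmoid kernel is not positive semi-definite in general. It assumes the form K(x, y) = tanh(α x·y + c); the parameter name `alpha`, the chosen values, and the sample point are illustrative assumptions. A positive semi-definite kernel must satisfy K(x, x) >= 0 for every x, and a negative constant is enough to break that.

```python
import math


def sigmoid_kernel(x, y, alpha=1.0, c=0.0):
    """Sigmoid (hyperbolic tangent) kernel: tanh(alpha * <x, y> + c)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(alpha * dot + c)


# Made-up point with <x, x> = 0.5.
x = [0.5, 0.5]

# With c = -1 the argument of tanh is -0.5, so K(x, x) is negative.
# The 1 x 1 Gram matrix [[K(x, x)]] is therefore not positive semi-definite.
value = sigmoid_kernel(x, x, alpha=1.0, c=-1.0)
print(value)
```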

Applying the kernel trick to an SVM model with different kernels gives us different models. How do we choose which kernel to use? A reasonable first approach is to try the RBF kernel, since it works well on many datasets. However, it is a good idea to try other kernels as well and validate your results, for example with cross-validation. Matching the kernel to the dataset can help you build better SVM models.
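One way to compare kernels is to cross-validate the same pipeline with each of them. The following is a minimal sketch, assuming scikit-learn is installed; the synthetic `make_moons` dataset, the fold count, and the default hyperparameters are placeholders rather than recommendations, and in practice you would also tune parameters such as `C` and `gamma` for each kernel.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, non-linearly separable toy data (an assumption for this example).
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    # Scale the features first, since SVMs are sensitive to feature scale.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, gamma="scale"))
    # Mean accuracy over 5 cross-validation folds.
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()

# Print the kernels from best to worst mean accuracy.
for kernel, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kernel:8s} {score:.3f}")
```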
