This works well when the classes can be linearly separated, as above.
In practice that's rarely the case; more often we'll have something like the example below, where the classes are not linearly separable.
In this case, we would need to transform the data into a higher-dimensional space where the classes become separable. Kernel functions let us compute the dot product of vectors in that higher-dimensional space without ever having to compute the transformation itself (the "kernel trick"). A couple of popular kernels are the 'linear' kernel and the 'rbf' (radial basis function) kernel. Conceptually, this is like mapping our 2-dimensional points into a 3-dimensional space, similar to what's shown below.
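To make this concrete, here's a minimal sketch of fitting an SVM with the 'rbf' kernel on data that isn't linearly separable in 2 dimensions. This assumes scikit-learn, and `make_circles` is just a synthetic stand-in for the kind of data pictured above.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings -- no straight line can separate these classes in 2-D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# kernel='rbf' computes similarities as if the points had been mapped into a
# higher-dimensional space, without ever constructing that mapping explicitly.
clf = SVC(kernel='rbf', gamma='scale')
clf.fit(X, y)

print(clf.score(X, y))  # training accuracy; should be close to 1.0
```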
In this new space, we can linearly separate the classes, as shown by the plane in the graph below.
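For intuition, one explicit mapping that produces a picture like this (an illustrative choice, not the transform the rbf kernel actually uses) is to add each point's squared distance from the origin as a third coordinate. Continuing the sketch above:

```python
import numpy as np

def lift_to_3d(X):
    """Map each 2-D point (x1, x2) to (x1, x2, x1**2 + x2**2)."""
    return np.c_[X, (X ** 2).sum(axis=1)]

X3 = lift_to_3d(X)  # X from the previous sketch

# The inner ring ends up with a small third coordinate and the outer ring a
# large one, so a flat plane of the form z = c separates the two classes.
```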
This plane can then be projected back into the original 2-dimensional space, where it appears as a non-linear decision boundary.
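One way to visualize that boundary (a sketch, assuming matplotlib plus the `clf` and `X` from the earlier snippet) is to evaluate the classifier's decision function over a grid and draw its zero contour:

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate the decision function on a dense 2-D grid.
xx, yy = np.meshgrid(np.linspace(-1.5, 1.5, 200),
                     np.linspace(-1.5, 1.5, 200))
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# The zero level of the decision function is the boundary: a circle-like
# curve in 2-D, even though it was a flat plane in the lifted space.
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm', s=15)
plt.contour(xx, yy, Z, levels=[0], colors='black')
plt.show()
```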