3
Sep

How SVM (Support Vector Machine) algorithm works


Hello, I will explain how the SVM algorithm works.
This video explains the support vector machine for linearly separable binary sets.
Suppose we have two features, x1 and x2, and we want to classify all these elements.
You can see that we have the class circle and the class rectangle.
The goal of the SVM is to design a hyperplane, here the green line, that separates all the training vectors into two classes.
Here we show two different hyperplanes that both classify all the instances in this feature set correctly.
But the best choice is the hyperplane that leaves the maximum margin from both classes.
The margin is the distance between the hyperplane and the elements closest to it.
In the case of the red hyperplane we have this distance, the margin, which we represent by z1.
In the case of the green hyperplane we have the margin that we call z2.
We can clearly see that the value of z2 is greater than z1.
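The comparison between the two hyperplanes can be sketched in a few lines of Python. The points and the two candidate lines below are hypothetical stand-ins (they are not read off the figure in the video), and the margin is taken as the smallest point-to-line distance, |w·x + w0| divided by the norm of w.

```python
import math

def margin(w, w0, points):
    """Smallest distance from any point to the line w[0]*x1 + w[1]*x2 + w0 = 0."""
    norm = math.hypot(w[0], w[1])
    return min(abs(w[0] * x1 + w[1] * x2 + w0) / norm for x1, x2 in points)

# Hypothetical training points: the first two belong to one class,
# the last two to the other.
points = [(1.0, 1.0), (2.0, 0.0), (2.0, 3.0), (3.0, 3.0)]

# Two hypothetical lines that both separate the classes,
# standing in for the red and the green hyperplane.
red = ((0.0, 1.0), -1.5)     # x2 - 1.5 = 0
green = ((1.0, 2.0), -5.5)   # x1 + 2*x2 - 5.5 = 0

z1 = margin(red[0], red[1], points)
z2 = margin(green[0], green[1], points)

# The line with the larger margin is the better choice.
best = "green" if z2 > z1 else "red"
print(f"z1 = {z1:.3f}, z2 = {z2:.3f}, best = {best}")
```

With these made-up values the green line wins, which mirrors the z2 > z1 situation described above.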
So the margin is larger for the green hyperplane, and in this case the best choice is the green hyperplane.
Suppose we have this hyperplane, which is defined by one equation, g.
The equation is the vector of weights omega multiplied by the input vector, plus omega 0, and it delivers values greater than or equal to 1 for all the input vectors that belong to class 1, in this case the circles.
We also scale this hyperplane so that it delivers values smaller than or equal to -1 for all the vectors that belong to class 2, the rectangles.
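The slide with the equation is not part of this transcript. In the usual notation, and assuming the input vector is written as x, the hyperplane and its scaling would read roughly as follows:

```latex
g(\vec{x}) = \vec{\omega}^{T}\vec{x} + \omega_0,
\qquad
g(\vec{x}) \ge 1 \;\; \forall\, \vec{x} \in \text{class 1},
\qquad
g(\vec{x}) \le -1 \;\; \forall\, \vec{x} \in \text{class 2}
```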
With this scaling, the modulus of g is at least 1 for every element, and it is exactly 1 for the closest ones.
From geometry we know that the distance between a point and a hyperplane is the modulus of g at that point divided by the norm of the weight vector.
So the total margin, which is composed of this distance on each side of the hyperplane, is 2 divided by the norm of the weight vector.
The aim is to minimize this norm, because minimizing it maximizes the separability.
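Written out, and assuming the same g as above, the standard point-to-hyperplane distance and the resulting margin are:

```latex
d(\vec{x}) = \frac{|g(\vec{x})|}{\lVert \vec{\omega} \rVert},
\qquad
\text{margin} = \frac{1}{\lVert \vec{\omega} \rVert} + \frac{1}{\lVert \vec{\omega} \rVert}
              = \frac{2}{\lVert \vec{\omega} \rVert}
```

The two terms are the distances of the closest element of each class, where the modulus of g equals 1.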
When we minimize the norm of this weight vector we get the biggest margin that splits these two classes.
Minimizing this weight vector is a nonlinear optimization task, which can be solved using the Karush-Kuhn-Tucker (KKT) conditions with Lagrange multipliers.
The main equations state that omega is the solution of this sum here, and we also have this other rule.
So when we solve these equations, trying to minimize this omega vector, we maximize the margin between the two classes, which maximizes their separability.
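The equations on the slide are not reproduced in this transcript. As a sketch, the textbook hard-margin formulation looks like this, assuming labels y_i = +1 for class 1 and y_i = -1 for class 2 and multipliers named lambda_i (both are notational assumptions); the "other rule" is presumably the second KKT condition below:

```latex
\min_{\vec{\omega},\, \omega_0} \; \tfrac{1}{2} \lVert \vec{\omega} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \vec{\omega}^{T} \vec{x}_i + \omega_0 \right) \ge 1, \quad i = 1, \dots, N

\vec{\omega} = \sum_{i=1}^{N} \lambda_i \, y_i \, \vec{x}_i,
\qquad
\sum_{i=1}^{N} \lambda_i \, y_i = 0,
\qquad
\lambda_i \ge 0
```

Only the training vectors with a nonzero multiplier contribute to the sum, and those are the support vectors.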
Here we show a simple example.
Suppose we have these 2 features, x1 and x2, and we have these 3 values.
We want to find the best hyperplane that divides these 2 classes.
We can see clearly from this graph that the best dividing line is perpendicular to the line that connects these 2 values here.
So we can define the weight vector as this point minus this other point, which gives the constant a and 2 times this constant a.
Now we can solve for this weight vector and create the hyperplane equation from it.
We must find the value of a.
Since we have the weight vector omega, we can substitute the values of this point, and using the other point we can substitute its values as well.
When we evaluate the equation g at the input vector (1,1) we know the value is -1, because this point belongs to the class circle, so we have 3 times a plus omega 0 equal to -1.
When we use the second point, we apply the function and we know that it delivers the value 1, so we substitute it in the equation as well.
Given these 2 equations, we can isolate omega 0 in the second equation, and we have omega 0 equal to 1 minus 8 times a.
Putting this omega 0 into the first equation, we reach the value of a, which is 2 divided by 5.
Now that we know the value of a, we substitute it into the first equation and also find the value of omega 0.
We come to the conclusion that omega 0 is minus 11 divided by 5.
Since we know that the weight vector is (a, 2a), we can substitute the value of a and obtain the values of the weight vector.
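The arithmetic above can be checked with a short Python sketch using exact fractions. The first point, (1,1), is given in the text; the second point, (2,3), is an assumption inferred from the weight vector (a, 2a) and the "1 minus 8 times a" step, since the figure is not part of the transcript. The variable names are made up for illustration.

```python
from fractions import Fraction

# Point with g = -1 (given in the text) and point with g = +1
# (assumed to be (2, 3), inferred from the derivation).
p_neg = (Fraction(1), Fraction(1))
p_pos = (Fraction(2), Fraction(3))

# Direction of the weight vector: one point minus the other, so w = a * d.
d = (p_pos[0] - p_neg[0], p_pos[1] - p_neg[1])   # (1, 2)

# g(p) = a * (d . p) + w0, so each point gives one linear equation in a and w0.
k_neg = d[0] * p_neg[0] + d[1] * p_neg[1]   # 3  ->  3a + w0 = -1
k_pos = d[0] * p_pos[0] + d[1] * p_pos[1]   # 8  ->  8a + w0 = +1

# Subtracting the first equation from the second eliminates w0.
a = Fraction(2) / (k_pos - k_neg)
w0 = 1 - k_pos * a
w = (a * d[0], a * d[1])

print(f"a = {a}, w0 = {w0}, w = ({w[0]}, {w[1]})")
# a = 2/5, w0 = -11/5, w = (2/5, 4/5)
```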
In this case, these points are called the support vectors, because they compose the omega value, 2 divided by 5 and 4 divided by 5.
Substituting these values of omega and the value of omega 0, and multiplying everything by 5 divided by 2, we obtain the final equation that defines this green hyperplane, which is x1 plus 2 times x2 minus 5.5 equal to 0.
This hyperplane classifies the elements using support vector machines.
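As a quick sanity check, the sketch below plugs the values from the example into g, rescales the coefficients to recover the x1 + 2 x2 - 5.5 form, and uses the sign of g to classify a point. The second support vector, (2,3), and the test points (0,0) and (3,3) are assumptions made for illustration, as are the function names.

```python
from fractions import Fraction
import math

# Values obtained in the example.
w = (Fraction(2, 5), Fraction(4, 5))
w0 = Fraction(-11, 5)

def g(x1, x2):
    """Value of the hyperplane equation at the point (x1, x2)."""
    return w[0] * x1 + w[1] * x2 + w0

def classify(x1, x2):
    """+1 on the side where g is non-negative, -1 on the other side."""
    return 1 if g(x1, x2) >= 0 else -1

# Multiplying by 5/2 gives the coefficients of x1 + 2*x2 - 5.5 = 0.
scale = Fraction(5, 2)
scaled = (w[0] * scale, w[1] * scale, w0 * scale)

# Total margin 2 / ||w||.
total_margin = 2 / math.hypot(float(w[0]), float(w[1]))

print(scaled == (1, 2, Fraction(-11, 2)))   # True
print(g(1, 1), g(2, 3))                     # -1 1
print(classify(0, 0), classify(3, 3))       # -1 1
```

The scaling does not move the line, it only changes the values g returns, which is why both forms describe the same hyperplane.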
These are some references that we have used.
So this is how the SVM algorithm works.
