Linear SVM: Example
Dataset
| Point | x1 | x2 | Class y |
|---|---|---|---|
| A | 1 | 2 | -1 |
| B | 2 | 2 | -1 |
| C | 4 | 2 | +1 |
| D | 5 | 2 | +1 |
We need to find:
1. Optimal hyperplane
2. Support vectors
3. Margin
4. Prediction rule
Step 1: General Hyperplane Equation
For Linear SVM:
wᵀx + b = 0
In 2D:
w1x1 + w2x2 + b = 0
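Our data points all have:
x2 = 2
Because x2 is the same for every point, it carries no information about the class, so the separation happens along x1 only.
Therefore, the boundary should be vertical.
Any w2 term would only add the constant 2w2, which b can absorb, while making ||w|| larger. So the maximum-margin solution has:
w2 = 0
Then the equation becomes:
w1x1 + b = 0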
Step 2: Identify the Closest Opposite-Class Points
Class -1 points:
A(1,2), B(2,2)
Class +1 points:
C(4,2), D(5,2)
The closest opposite-class points are:
B(2,2) and C(4,2)
Both points have x2 = 2, so the distance between them is the difference in x1:
Distance = 4 - 2 = 2
No other point lies between them, so these two points will lie on the margin lines.
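The search for the closest opposite-class pair can be sketched in Python. This is a minimal illustration using only the standard library; the variable names (`points`, `labels`, `closest_pair`) are assumptions made for this sketch. Note that picking the closest pair is a shortcut that works for this small dataset, not a general method for finding support vectors.

```python
from itertools import product
from math import dist

# Dataset from the table above: name -> (x1, x2) and name -> class label.
points = {"A": (1, 2), "B": (2, 2), "C": (4, 2), "D": (5, 2)}
labels = {"A": -1, "B": -1, "C": +1, "D": +1}

negatives = [name for name in points if labels[name] == -1]
positives = [name for name in points if labels[name] == +1]

# Compare every (negative, positive) pair and keep the one with the
# smallest Euclidean distance.
closest_pair = min(
    product(negatives, positives),
    key=lambda pair: dist(points[pair[0]], points[pair[1]]),
)
closest_distance = dist(points[closest_pair[0]], points[closest_pair[1]])

print(closest_pair, closest_distance)  # ('B', 'C') 2.0
```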
Step 3: SVM Margin Conditions
For support vectors:
y(wᵀx + b) = 1
For negative support vector B(2,2), y = -1:
-1(w1(2) + b) = 1
-(2w1 + b) = 1
2w1 + b = -1
Equation 1:
2w1 + b = -1
For positive support vector C(4,2), y = +1:
+1(w1(4) + b) = 1
4w1 + b = 1
Equation 2:
4w1 + b = 1
Step 4: Solve the Equations
We have:
2w1 + b = -1
4w1 + b = 1
Subtract Equation 1 from Equation 2:
(4w1 + b) - (2w1 + b) = 1 - (-1)
2w1 = 2
w1 = 1
Now substitute into Equation 1:
2(1) + b = -1
2 + b = -1
b = -3
So:
w1 = 1
w2 = 0
b = -3
Therefore:
w = [1, 0]
b = -3
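The elimination above can be reproduced in a few lines of Python. This is a minimal sketch; the helper name `solve_margin_equations` and its parameter names are illustrative assumptions, not part of the original example.

```python
def solve_margin_equations(x_neg, x_pos):
    """Solve w1*x_neg + b = -1 and w1*x_pos + b = +1 for (w1, b).

    x_neg and x_pos are the x1 coordinates of the negative and positive
    support vectors (hypothetical parameter names).
    """
    # Subtracting the first equation from the second gives
    # w1 * (x_pos - x_neg) = 2.
    w1 = 2 / (x_pos - x_neg)
    # Substitute w1 back into the first equation: b = -1 - w1 * x_neg.
    b = -1 - w1 * x_neg
    return w1, b


w1, b = solve_margin_equations(x_neg=2, x_pos=4)
w = [w1, 0.0]  # w2 = 0 because the boundary is vertical

print(w, b)  # [1.0, 0.0] -3.0
```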
Step 5: Final Hyperplane
The hyperplane is:
wᵀx + b = 0
Substitute values:
1(x1) + 0(x2) - 3 = 0
x1 - 3 = 0
So final decision boundary is:
x1 = 3
Step 6: Margin Lines
SVM has two margin lines:
wᵀx + b = +1
and
wᵀx + b = -1
Using our values:
Positive margin line:
x1 - 3 = +1
x1 = 4
Negative margin line:
x1 - 3 = -1
x1 = 2
So:
Negative margin line: x1 = 2
Decision boundary: x1 = 3
Positive margin line: x1 = 4
This matches:
B(2,2) lies on x1 = 2
C(4,2) lies on x1 = 4
So B and C are support vectors.
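The three vertical lines can also be computed directly from w1 and b. The sketch below assumes the values from Step 4; the helper name `level_line` is hypothetical.

```python
w1, b = 1.0, -3.0  # values found in Step 4


def level_line(c):
    """Return the x1 value at which w1*x1 + b equals c."""
    return (c - b) / w1


negative_margin = level_line(-1)
decision_boundary = level_line(0)
positive_margin = level_line(+1)

print(negative_margin, decision_boundary, positive_margin)  # 2.0 3.0 4.0
```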
Step 7: Check All Points Using SVM Condition
SVM condition:
y(wᵀx + b) ≥ 1
Here:
f(x) = x1 - 3
Point A(1,2), y = -1
f(x) = 1 - 3 = -2
y × f(x) = (-1)(-2) = 2
Since:
2 ≥ 1
Correctly classified.
Not a support vector, because the value is strictly greater than 1.
Point B(2,2), y = -1
f(x) = 2 - 3 = -1
y × f(x) = (-1)(-1) = 1
Since:
1 = 1
Correctly classified and lies exactly on margin.
So:
B(2,2) is a support vector.
Point C(4,2), y = +1
f(x) = 4 - 3 = 1
y × f(x) = (+1)(+1) = 1
Since:
1 = 1
Correctly classified and lies exactly on margin.
So:
C(4,2) is a support vector.
Point D(5,2), y = +1
f(x) = 5 - 3 = 2
y × f(x) = (+1)(+2) = 2
Since:
2 ≥ 1
Correctly classified.
Not a support vector, because the value is strictly greater than 1.
Step 8: Support Vectors
Support vectors satisfy:
y(wᵀx + b) = 1
From calculation:
| Point | y | f(x) | y × f(x) | Support Vector? |
|---|---|---|---|---|
| A(1,2) | -1 | -2 | 2 | No |
| B(2,2) | -1 | -1 | 1 | Yes |
| C(4,2) | +1 | +1 | 1 | Yes |
| D(5,2) | +1 | +2 | 2 | No |
Therefore:
Support Vectors = B(2,2), C(4,2)
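The table can be regenerated programmatically. The following is a minimal sketch, assuming w = [1, 0] and b = -3 from Step 4; the names `functional_margins` and `support_vectors` are chosen for illustration. It uses `math.isclose` for the equality test because values from a numerical solver would rarely equal 1 exactly.

```python
from math import isclose

points = {"A": (1, 2), "B": (2, 2), "C": (4, 2), "D": (5, 2)}
labels = {"A": -1, "B": -1, "C": +1, "D": +1}
w, b = (1.0, 0.0), -3.0


def f(x):
    """Decision function f(x) = w1*x1 + w2*x2 + b."""
    return w[0] * x[0] + w[1] * x[1] + b


# y * f(x) for every training point.
functional_margins = {name: labels[name] * f(x) for name, x in points.items()}

# Every point must satisfy y * f(x) >= 1.
all_satisfied = all(m >= 1 for m in functional_margins.values())

# Support vectors are the points where the constraint holds with equality.
support_vectors = [
    name for name, m in functional_margins.items() if isclose(m, 1.0)
]

for name, m in functional_margins.items():
    print(name, f(points[name]), m, name in support_vectors)
```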
Step 9: Margin Calculation
Margin width formula:
Margin width = 2 / ||w||
Here:
w = [1, 0]
So:
||w|| = √(1² + 0²)
||w|| = 1
Therefore:
Margin width = 2 / 1
Margin width = 2
So total margin width is:
2 units
Distance from decision boundary to each margin line:
1 / ||w|| = 1 unit
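The same arithmetic in Python, as a small sketch using the standard library `math.hypot` for the Euclidean norm (variable names are illustrative):

```python
from math import hypot

w = (1.0, 0.0)

norm_w = hypot(*w)         # ||w|| = sqrt(1^2 + 0^2)
margin_width = 2 / norm_w  # distance between the two margin lines
half_margin = 1 / norm_w   # distance from the boundary to each margin line

print(norm_w, margin_width, half_margin)  # 1.0 2.0 1.0
```

The margin width of 2 equals the distance between B and C found in Step 2, as expected when both support vectors sit directly opposite each other.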
Step 10: Prediction Rule
Final model:
f(x) = x1 - 3
Prediction rule:
If f(x) > 0 → Class +1
If f(x) < 0 → Class -1
If f(x) = 0 → Point lies on boundary
Step 11: Predict New Points
New Point P(6,2)
f(x) = 6 - 3 = 3
Since:
f(x) > 0
Prediction:
Class +1
New Point Q(1.5,2)
f(x) = 1.5 - 3 = -1.5
Since:
f(x) < 0
Prediction:
Class -1
New Point R(3.5,2)
f(x) = 3.5 - 3 = 0.5
Since:
f(x) > 0
Prediction:
Class +1
But this point lies inside the margin (|f(x)| = 0.5 < 1), so the confidence is lower.
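The prediction rule and the three new points can be sketched as follows. The function names `decision_function` and `predict` are assumptions for this illustration, and returning 0 for a point exactly on the boundary is a convention chosen here, not something the example prescribes.

```python
def decision_function(x1, x2):
    """f(x) = w1*x1 + w2*x2 + b with w = [1, 0] and b = -3."""
    return 1.0 * x1 + 0.0 * x2 - 3.0


def predict(x1, x2):
    """Return +1 or -1 from the sign of f(x); 0 means 'on the boundary'
    (an assumed convention for this sketch)."""
    score = decision_function(x1, x2)
    if score > 0:
        return +1
    if score < 0:
        return -1
    return 0


new_points = {"P": (6, 2), "Q": (1.5, 2), "R": (3.5, 2)}
predictions = {name: predict(*x) for name, x in new_points.items()}

# A point with |f(x)| < 1 falls between the two margin lines.
inside_margin = {
    name: abs(decision_function(*x)) < 1 for name, x in new_points.items()
}

print(predictions)    # {'P': 1, 'Q': -1, 'R': 1}
print(inside_margin)  # {'P': False, 'Q': False, 'R': True}
```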
Final Result
Optimal Hyperplane: x1 - 3 = 0
Decision Boundary: x1 = 3
Negative Margin Line: x1 = 2
Positive Margin Line: x1 = 4
Support Vectors: B(2,2), C(4,2)
Weight Vector: w = [1, 0]
Bias: b = -3
Margin Width: 2
Prediction Function: f(x) = x1 - 3
Important Points
- Linear SVM finds the hyperplane with maximum margin.
- The closest points to the decision boundary are support vectors.
- Support vectors satisfy y(wᵀx + b) = 1.
- Non-support vectors satisfy y(wᵀx + b) > 1.
- In this example, B(2,2) and C(4,2) are support vectors.
- The final decision boundary is x1 = 3.
- New data is classified using the sign of f(x) = x1 - 3.
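As a sanity check on the maximum-margin claim, one can compare x1 = 3 against other vertical boundaries. The sketch below is a brute-force illustration, not how an SVM solver works: it assumes a vertical boundary x1 = t with unit normal w = [1, 0], scans a hypothetical grid of thresholds between B and C, and keeps the one whose closest training point is farthest away.

```python
x1_values = [1, 2, 4, 5]
labels = [-1, -1, +1, +1]


def geometric_margin(threshold):
    """Smallest signed distance from any training point to x1 = threshold.

    A positive value means every point is on its correct side.
    """
    return min(y * (x1 - threshold) for x1, y in zip(x1_values, labels))


# Candidate thresholds 2.1, 2.2, ..., 3.9 (an arbitrary grid for illustration).
candidates = [t / 10 for t in range(21, 40)]

best_threshold = max(candidates, key=geometric_margin)
best_margin = geometric_margin(best_threshold)

print(best_threshold, best_margin)  # 3.0 1.0
```

Moving the boundary toward either class shrinks the distance to B or to C, so the midpoint x1 = 3 gives the largest margin on this grid.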