Linear SVM: Example - Machine Learning

Dataset

Point	x1	x2	Class y
A	1	2	-1
B	2	2	-1
C	4	2	+1
D	5	2	+1

We need to find:

1. Optimal hyperplane
2. Support vectors
3. Margin
4. Prediction rule

Step 1: General Hyperplane Equation

For Linear SVM:

wᵀx + b = 0

In 2D:

w1x1 + w2x2 + b = 0

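All of our data points have:

x2 = 2

Because x2 is the same for every point, it cannot help separate the classes: the term w2x2 = 2w2 is a constant that can be folded into the bias b. The separation therefore depends only on x1, and the boundary is a vertical line.

SVM minimizes ||w||, and a nonzero w2 would only increase ||w|| without changing any constraint, so the optimal solution has:

w2 = 0

Then the equation becomes: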

w1x1 + b = 0
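
The effect of w2 can be checked numerically. The sketch below takes an arbitrary candidate with a nonzero w2 (the values 1.0, 0.5 and -4.0 are illustrative assumptions, not part of the example) and shows that folding the constant 2·w2 into the bias leaves every score unchanged while reducing ||w||.

```python
import math

# Coordinates of A, B, C, D from the dataset; x2 is 2 for every point.
X = [(1, 2), (2, 2), (4, 2), (5, 2)]

def scores(w1, w2, b):
    """Evaluate w1*x1 + w2*x2 + b for every training point."""
    return [w1 * x1 + w2 * x2 + b for x1, x2 in X]

# Hypothetical candidate with a nonzero w2 (illustrative values only).
w1, w2, b = 1.0, 0.5, -4.0

# Fold the constant contribution w2 * 2 into the bias and drop w2.
folded = (w1, 0.0, b + 2 * w2)

same_scores = scores(w1, w2, b) == scores(*folded)
norm_before = math.hypot(w1, w2)
norm_after = math.hypot(folded[0], folded[1])

print(same_scores, norm_before, norm_after)
```

Both parameter sets classify the data identically, but the folded one has the smaller norm, which is why the maximum-margin solution sets w2 = 0 here.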

Step 2: Identify the Closest Opposite-Class Points

Class -1 points:

A(1,2), B(2,2)

Class +1 points:

C(4,2), D(5,2)

The closest opposite-class points are:

B(2,2) and C(4,2)

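Since both points have x2 = 2, the distance between them is the difference in x1:

Distance = 4 - 2 = 2

All four points lie on one horizontal line, so the maximum-margin boundary falls midway between B and C, and these two points will lie on the margin lines.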
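
The search for the closest opposite-class pair can be sketched in a few lines of Python. The helper name `euclidean` and the dictionary layout are assumptions made for this illustration. Note that picking the closest pair is a shortcut that works for this collinear dataset; in general the support vectors come out of solving the SVM optimization problem.

```python
from itertools import product

# Dataset from the table above: name -> ((x1, x2), label)
points = {
    "A": ((1, 2), -1),
    "B": ((2, 2), -1),
    "C": ((4, 2), +1),
    "D": ((5, 2), +1),
}

def euclidean(p, q):
    """Straight-line distance between two 2D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

negatives = [(name, x) for name, (x, y) in points.items() if y == -1]
positives = [(name, x) for name, (x, y) in points.items() if y == +1]

# Compare every negative point with every positive point and keep the closest pair.
closest = min(
    product(negatives, positives),
    key=lambda pair: euclidean(pair[0][1], pair[1][1]),
)
(neg_name, neg_x), (pos_name, pos_x) = closest
gap = euclidean(neg_x, pos_x)

print(neg_name, pos_name, gap)
```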

Step 3: SVM Margin Conditions

For support vectors:

y(wᵀx + b) = 1

For negative support vector B(2,2), y = -1:

-1(w1(2) + b) = 1

-(2w1 + b) = 1

2w1 + b = -1

Equation 1:

2w1 + b = -1

For positive support vector C(4,2), y = +1:

+1(w1(4) + b) = 1

4w1 + b = 1

Equation 2:

4w1 + b = 1

Step 4: Solve the Equations

We have:

2w1 + b = -1
4w1 + b = 1

Subtract Equation 1 from Equation 2:

(4w1 + b) - (2w1 + b) = 1 - (-1)

2w1 = 2

w1 = 1

Now substitute into Equation 1:

2(1) + b = -1

2 + b = -1

b = -3

So:

w1 = 1
w2 = 0
b = -3

Therefore:

w = [1, 0]
b = -3
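
The same two equations can be solved programmatically. The following is a minimal sketch using Cramer's rule; the function name `solve_2x2` is a hypothetical helper introduced only for this illustration.

```python
def solve_2x2(a11, a12, c1, a21, a22, c2):
    """Solve a11*u + a12*v = c1 and a21*u + a22*v = c2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    u = (c1 * a22 - a12 * c2) / det
    v = (a11 * c2 - c1 * a21) / det
    return u, v

# Equation 1: 2*w1 + 1*b = -1   (support vector B)
# Equation 2: 4*w1 + 1*b = +1   (support vector C)
w1, b = solve_2x2(2, 1, -1, 4, 1, 1)
w = [w1, 0.0]  # w2 = 0, as assumed in Step 1

print(w, b)
```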

Step 5: Final Hyperplane

The hyperplane is:

wᵀx + b = 0

Substitute values:

1(x1) + 0(x2) - 3 = 0

x1 - 3 = 0

So final decision boundary is:

x1 = 3
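
As a sanity check on the maximum-margin claim, the sketch below does a brute-force search over a coarse grid of (w1, b) values, keeps the pairs that satisfy y(w1x1 + b) ≥ 1 for all four points, and picks the one with the smallest w1². The grid bounds and step size are assumptions chosen for this illustration; this is not how SVM solvers work in practice, only a way to confirm the hand calculation.

```python
from fractions import Fraction

# (x1, label) for A, B, C, D; x2 is dropped because w2 = 0.
data = [(1, -1), (2, -1), (4, +1), (5, +1)]

def feasible(w1, b):
    """True if every point satisfies the constraint y*(w1*x1 + b) >= 1."""
    return all(y * (w1 * x1 + b) >= 1 for x1, y in data)

# Hypothetical search grid: step 1/20, w1 in [-2, 2], b in [-6, 6].
# Fractions keep the comparisons exact.
step = 20
candidates = [
    (Fraction(i, step), Fraction(j, step))
    for i in range(-2 * step, 2 * step + 1)
    for j in range(-6 * step, 6 * step + 1)
]
feasible_pairs = [(w1, b) for w1, b in candidates if feasible(w1, b)]

# Maximizing the margin 2/|w1| is the same as minimizing w1**2.
best_w1, best_b = min(feasible_pairs, key=lambda p: p[0] ** 2)

print(best_w1, best_b)
```

On this grid the search returns w1 = 1 and b = -3, which agrees with the boundary x1 = 3.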

Step 6: Margin Lines

SVM has two margin lines:

wᵀx + b = +1

and

wᵀx + b = -1

Using our values:

Positive margin line:

x1 - 3 = +1

x1 = 4

Negative margin line:

x1 - 3 = -1

x1 = 2

So:

Negative margin line: x1 = 2
Decision boundary: x1 = 3
Positive margin line: x1 = 4

This matches:

B(2,2) lies on x1 = 2
C(4,2) lies on x1 = 4

So B and C are support vectors.

Step 7: Check All Points Using SVM Condition

SVM condition:

y(wᵀx + b) ≥ 1

Here:

f(x) = x1 - 3

Point A(1,2), y = -1

f(x) = 1 - 3 = -2

y × f(x) = (-1)(-2) = 2

Since:

2 ≥ 1

Correctly classified.

Not a support vector, because the value is greater than 1.

Point B(2,2), y = -1

f(x) = 2 - 3 = -1

y × f(x) = (-1)(-1) = 1

Since:

1 = 1

Correctly classified and lies exactly on the margin.

So:

B(2,2) is a support vector.

Point C(4,2), y = +1

f(x) = 4 - 3 = 1

y × f(x) = (+1)(+1) = 1

Since:

1 = 1

Correctly classified and lies exactly on the margin.

So:

C(4,2) is a support vector.

Point D(5,2), y = +1

f(x) = 5 - 3 = 2

y × f(x) = (+1)(+2) = 2

Since:

2 ≥ 1

Correctly classified.

Not a support vector, because the value is greater than 1.

Step 8: Support Vectors

Support vectors satisfy:

y(wᵀx + b) = 1

From calculation:

Point	y	f(x)	y × f(x)	Support Vector?
A(1,2)	-1	-2	2	No
B(2,2)	-1	-1	1	Yes
C(4,2)	+1	+1	1	Yes
D(5,2)	+1	+2	2	No

Therefore:

Support Vectors = B(2,2), C(4,2)
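
The table can be reproduced with a short script. This is a minimal sketch that assumes w = [1, 0] and b = -3 from the earlier steps; the tolerance `TOL` and the variable names are choices made for this illustration.

```python
# Model from Steps 4 and 5.
w = (1.0, 0.0)
b = -3.0

# (name, (x1, x2), label) rows from the dataset.
data = [
    ("A", (1, 2), -1),
    ("B", (2, 2), -1),
    ("C", (4, 2), +1),
    ("D", (5, 2), +1),
]

def decision(x):
    """f(x) = w . x + b"""
    return w[0] * x[0] + w[1] * x[1] + b

TOL = 1e-9  # tolerance for floating-point comparison (an assumption of this sketch)

# y * f(x) for every training point.
functional_margins = {name: y * decision(x) for name, x, y in data}

# Every point must satisfy y * f(x) >= 1; support vectors meet it with equality.
all_satisfied = all(m >= 1 - TOL for m in functional_margins.values())
support_vectors = [name for name, m in functional_margins.items() if abs(m - 1) <= TOL]

print(functional_margins)
print(all_satisfied, support_vectors)
```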

Step 9: Margin Calculation

Margin width formula:

Margin width = 2 / ||w||

Here:

w = [1, 0]

So:

||w|| = √(1² + 0²)

||w|| = 1

Therefore:

Margin width = 2 / 1

Margin width = 2

So total margin width is:

2 units

Distance from the decision boundary to each margin line (1 / ||w||):

1 unit
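
The margin calculation, together with a check that both support vectors sit exactly one half-margin from the boundary, can be sketched as follows. The function name `distance_to_boundary` is a hypothetical helper; it uses the standard point-to-hyperplane distance |wᵀx + b| / ||w||.

```python
import math

# Model from Steps 4 and 5.
w = (1.0, 0.0)
b = -3.0

norm_w = math.hypot(w[0], w[1])   # ||w||
margin_width = 2 / norm_w         # distance between the two margin lines
half_margin = 1 / norm_w          # boundary to either margin line

def distance_to_boundary(x):
    """Perpendicular distance from point x to the hyperplane w . x + b = 0."""
    return abs(w[0] * x[0] + w[1] * x[1] + b) / norm_w

print(margin_width, half_margin)
print(distance_to_boundary((2, 2)), distance_to_boundary((4, 2)))
```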

Step 10: Prediction Rule

Final model:

f(x) = x1 - 3

Prediction rule:

If f(x) > 0 → Class +1
If f(x) < 0 → Class -1
If f(x) = 0 → The point lies on the decision boundary

Step 11: Predict New Points

New Point P(6,2)

f(x) = 6 - 3 = 3

Since:

f(x) > 0

Prediction:

Class +1

New Point Q(1.5,2)

f(x) = 1.5 - 3 = -1.5

Since:

f(x) < 0

Prediction:

Class -1

New Point R(3.5,2)

f(x) = 3.5 - 3 = 0.5

Since:

f(x) > 0

Prediction:

Class +1

However, this point lies inside the margin (|f(x)| = 0.5 < 1), closer to the boundary than any training point, so the prediction is less confident.
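
The prediction rule and the three new points can be put together in a small sketch. Returning 0 for a point exactly on the boundary and the `inside_margin` helper are conventions assumed for this illustration, not part of the SVM definition.

```python
# Model from Steps 4 and 5.
W = (1.0, 0.0)
B = -3.0

def f(x):
    """Decision function f(x) = w . x + b, which reduces to x1 - 3 here."""
    return W[0] * x[0] + W[1] * x[1] + B

def predict(x):
    """Return +1 or -1 from the sign of f(x); 0 marks a point on the boundary."""
    score = f(x)
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0  # convention chosen for this sketch

def inside_margin(x):
    """True when the point falls strictly between the two margin lines."""
    return abs(f(x)) < 1

for name, x in [("P", (6, 2)), ("Q", (1.5, 2)), ("R", (3.5, 2))]:
    print(name, f(x), predict(x), inside_margin(x))
```

Only R is flagged as inside the margin, which matches the observation that it is the least confident of the three predictions.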

Final Result

Optimal Hyperplane: x1 - 3 = 0

Decision Boundary: x1 = 3

Negative Margin Line: x1 = 2

Positive Margin Line: x1 = 4

Support Vectors: B(2,2), C(4,2)

Weight Vector: w = [1, 0]

Bias: b = -3

Margin Width: 2

Prediction Function: f(x) = x1 - 3

Important Points

A hard-margin linear SVM finds the separating hyperplane with the maximum margin.
The training points closest to the decision boundary are the support vectors.
Support vectors satisfy y(wᵀx + b) = 1.
Non-support vectors satisfy y(wᵀx + b) > 1.
In this example, B(2,2) and C(4,2) are support vectors.
The final decision boundary is x1 = 3.
New data is classified using the sign of f(x) = x1 - 3.