PCA Example
PCA Problem
Given data:
| Point | X | Y |
|---|---|---|
| A | 2 | 0 |
| B | 0 | 2 |
| C | 3 | 1 |
| D | 1 | 3 |
Goal: Reduce 2D data into 1D using PCA.
Step 1: Calculate Mean
Mean of X = (2 + 0 + 3 + 1) / 4 = 6 / 4 = 1.5
Mean of Y = (0 + 2 + 1 + 3) / 4 = 6 / 4 = 1.5
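The same means can be checked with NumPy. This is a minimal sketch, assuming the four points are stored one per row in an array named X (the variable names are illustrative, not part of the original example):

```python
import numpy as np

# Points A, B, C, D from the table, one row per point, columns are X and Y.
X = np.array([[2, 0], [0, 2], [3, 1], [1, 3]], dtype=float)

# axis=0 averages down each column, giving one mean per feature.
mean = X.mean(axis=0)
print(mean)  # both entries are 1.5
```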
Step 2: Center the Data
Subtract the mean of each feature from every value of that feature (here both means are 1.5).
| Point | X | Y | X - 1.5 | Y - 1.5 |
|---|---|---|---|---|
| A | 2 | 0 | 0.5 | -1.5 |
| B | 0 | 2 | -1.5 | 0.5 |
| C | 3 | 1 | 1.5 | -0.5 |
| D | 1 | 3 | -0.5 | 1.5 |
Centered matrix:
X_centered =
[ 0.5 -1.5 ]
[ -1.5 0.5 ]
[ 1.5 -0.5 ]
[ -0.5 1.5 ]
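Centering can be sketched in one line with NumPy broadcasting. The array name X_centered mirrors the matrix above; the rest of the naming is an assumption made for illustration:

```python
import numpy as np

X = np.array([[2, 0], [0, 2], [3, 1], [1, 3]], dtype=float)

# Broadcasting subtracts the per-column mean from every row.
X_centered = X - X.mean(axis=0)
print(X_centered)

# After centering, each column sums to zero.
column_sums = X_centered.sum(axis=0)
print(column_sums)
```

A quick sanity check after centering is that every column sums to zero, which is what the last line prints.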
Step 3: Calculate Covariance Matrix
The covariance matrix describes how each feature varies individually and how the features vary with respect to one another.
The covariance matrix has the following structure:
Covariance =
[ Variance(X)      Covariance(X,Y) ]
[ Covariance(Y,X)  Variance(Y)     ]
First, calculate the Variance of X.
Variance(X)
= (0.5² + (-1.5)² + 1.5² + (-0.5)²) / (4 - 1)
= (0.25 + 2.25 + 2.25 + 0.25) / 3
= 5 / 3
= 1.6667
Next, calculate the Variance of Y.
Variance(Y)
= ((-1.5)² + 0.5² + (-0.5)² + 1.5²) / (4 - 1)
= (2.25 + 0.25 + 0.25 + 2.25) / 3
= 5 / 3
= 1.6667
Now calculate the Covariance between X and Y.
Cov(X,Y)
= [(0.5)(-1.5) + (-1.5)(0.5) + (1.5)(-0.5) + (-0.5)(1.5)] / (4 - 1)
= (-0.75 - 0.75 - 0.75 - 0.75) / 3
= -3 / 3
= -1
Since covariance is symmetric,
Cov(Y,X) = Cov(X,Y) = -1
Finally, substitute these values into the covariance matrix.
Covariance =
[ Variance(X)  Cov(X,Y)    ]
[ Cov(Y,X)     Variance(Y) ]
=
[  1.6667  -1.0000 ]
[ -1.0000   1.6667 ]
Therefore, the covariance matrix is:
[ 1.6667 -1.0000 ]
[ -1.0000 1.6667 ]
Alternatively, the same matrix can be computed in one step with matrix multiplication, using the centered matrix from Step 2.
Formula:
Covariance Matrix = (X_centeredᵀ X_centered) / (n - 1)
Here:
n = 4 (the number of points)
n - 1 = 3
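As a hedged sketch, the matrix formula can be evaluated directly and compared against NumPy's built-in np.cov. The names cov_manual and cov_numpy are illustrative:

```python
import numpy as np

X = np.array([[2, 0], [0, 2], [3, 1], [1, 3]], dtype=float)
X_centered = X - X.mean(axis=0)
n = X.shape[0]

# Matrix form of the sample covariance, dividing by n - 1.
cov_manual = X_centered.T @ X_centered / (n - 1)

# np.cov centers the data itself; rowvar=False means columns are the features.
# Its default normalization is also n - 1.
cov_numpy = np.cov(X, rowvar=False)

print(cov_manual)
print(np.allclose(cov_manual, cov_numpy))  # True
```

Both routes should give 1.6667 on the diagonal and -1 off the diagonal, matching the hand calculation.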
Step 4: Find Eigenvalues
Covariance matrix:
A =
[ 1.6667 -1.0000 ]
[ -1.0000 1.6667 ]
To find eigenvalues, solve:
|A - λI| = 0
That means:
| 1.6667 - λ -1 |
| -1 1.6667 - λ | = 0
For a 2 × 2 matrix:
|a b|
|c d| = ad - bc
So:
(1.6667 - λ)(1.6667 - λ) - (-1)(-1) = 0
(1.6667 - λ)² - 1 = 0
Now solve:
(1.6667 - λ)² = 1
Take the square root of both sides:
1.6667 - λ = ±1
Case 1
1.6667 - λ = 1
λ = 0.6667
Case 2
1.6667 - λ = -1
λ = 2.6667
So eigenvalues are:
λ₁ = 2.6667
λ₂ = 0.6667
The larger eigenvalue is:
λ₁ = 2.6667
Its eigenvector will be the first principal component.
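The eigenvalues can be cross-checked numerically. The sketch below expands |A - λI| = 0 for a 2 × 2 matrix into λ² - (a + d)λ + (ad - bc) = 0, solves it with the quadratic formula, and compares the result with np.linalg.eigvalsh. The variable names are assumptions for illustration, and A is entered with exact fractions (5/3) rather than the rounded 1.6667:

```python
import numpy as np

A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])

# Characteristic equation of a 2 x 2 matrix: lam^2 - trace*lam + det = 0.
trace = A[0, 0] + A[1, 1]
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
disc = np.sqrt(trace**2 - 4 * det)
lam_1 = (trace + disc) / 2  # larger root
lam_2 = (trace - disc) / 2  # smaller root

# eigvalsh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(A)

print(lam_1, lam_2)
print(eigenvalues)
```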
Step 5: Find Eigenvector for λ = 2.6667
Eigenvector formula:
(A - λI)v = 0
Let:
v = [v₁, v₂]
Substitute:
λ = 2.6667
A - λI =
[ 1.6667 - 2.6667 -1 ]
[ -1 1.6667 - 2.6667 ]
=
[ -1 -1 ]
[ -1 -1 ]
Now solve:
[ -1 -1 ] [v₁]   [0]
[ -1 -1 ] [v₂] = [0]
Both rows give the same equation:
-v₁ - v₂ = 0
So:
v₁ + v₂ = 0
Therefore:
v₂ = -v₁
Choose:
v₁ = 1
Then:
v₂ = -1
So one eigenvector is:
[ 1 ]
[-1 ]
PCA conventionally uses unit-length eigenvectors, so we normalize it by dividing by its length.
Length:
√(1² + (-1)²)
= √2
= 1.4142
Normalized eigenvector:
[ 1 / √2 ]
[-1 / √2 ]
=
[ 0.7071 ]
[ -0.7071 ]
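So,
PC1 = [0.7071, -0.7071]
The opposite vector [-0.7071, 0.7071] is an equally valid choice; it only flips the sign of the projected values.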
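A short check that this vector really is a unit-length eigenvector, sketched under the assumption that A holds the covariance matrix with exact fractions (names are illustrative):

```python
import numpy as np

A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])
lam = 8 / 3  # the larger eigenvalue, 2.6667

# Start from the unnormalized solution v2 = -v1 with v1 = 1.
v = np.array([1.0, -1.0])

# Divide by the Euclidean length to get a unit vector.
pc1 = v / np.linalg.norm(v)
print(pc1)

# An eigenvector must satisfy A v = lam v.
is_eigenvector = np.allclose(A @ pc1, lam * pc1)
print(is_eigenvector)  # True
```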
Step 6: Find Eigenvector for λ = 0.6667
Again:
(A - λI)v = 0
Substitute:
λ = 0.6667
A - λI =
[ 1.6667 - 0.6667 -1 ]
[ -1 1.6667 - 0.6667 ]
=
[ 1 -1 ]
[-1 1 ]
Now solve:
[ 1 -1 ] [v₁]   [0]
[-1  1 ] [v₂] = [0]
Both rows reduce to the same equation:
v₁ - v₂ = 0
So:
v₁ = v₂
Choose:
v₁ = 1
Then:
v₂ = 1
So one eigenvector is:
[1]
[1]
Normalize it.
Length:
√(1² + 1²)
= √2
= 1.4142
Normalized eigenvector:
[1 / √2]
[1 / √2]
=
[0.7071]
[0.7071]
So,
PC2 = [0.7071, 0.7071]
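Both hand-derived eigenvectors can be compared with np.linalg.eigh, which returns the eigenvalues of a symmetric matrix in ascending order with the eigenvectors as columns. This is a sketch with illustrative names; because the library may return either sign of each eigenvector, the comparison uses the absolute value of the dot product:

```python
import numpy as np

A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])

# Hand-derived unit eigenvectors from Steps 5 and 6.
pc1 = np.array([1.0, -1.0]) / np.sqrt(2)
pc2 = np.array([1.0, 1.0]) / np.sqrt(2)

# Eigenvectors of a symmetric matrix with distinct eigenvalues are orthogonal.
are_orthogonal = np.isclose(pc1 @ pc2, 0.0)

# Column 0 pairs with the smaller eigenvalue, column 1 with the larger one.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# |dot product| == 1 means the two unit vectors agree up to sign.
matches_pc2 = np.isclose(abs(eigenvectors[:, 0] @ pc2), 1.0)
matches_pc1 = np.isclose(abs(eigenvectors[:, 1] @ pc1), 1.0)

print(are_orthogonal, matches_pc1, matches_pc2)  # True True True
```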
Step 7: Select Principal Component
Eigenvalues and eigenvectors:
| Eigenvalue | Eigenvector | Meaning |
|---|---|---|
| 2.6667 | [0.7071, -0.7071] | Direction of maximum variance (PC1) |
| 0.6667 | [0.7071, 0.7071] | Direction of the remaining variance (PC2) |
The largest eigenvalue is:
2.6667
So we select:
PC1 = [0.7071, -0.7071]
This is the direction where the data varies the most.
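The selection step can be sketched as sorting the eigenvalues in descending order and keeping the first eigenvector. The sketch also computes the share of total variance each component carries, which is an addition to the worked example rather than something it states; all names are illustrative:

```python
import numpy as np

# Eigenvalues and unit eigenvectors from Steps 4 to 6 (one eigenvector per row).
eigenvalues = np.array([2 / 3, 8 / 3])
eigenvectors = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# Sort from largest to smallest eigenvalue.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[order]

# Keep only the top component to go from 2D to 1D.
pc1 = eigenvectors[0]

# Fraction of the total variance along each component.
variance_ratio = eigenvalues / eigenvalues.sum()

print(pc1)
print(variance_ratio)
```

With these numbers, the first component carries 2.6667 / 3.3333 = 80% of the total variance, so the 1D representation keeps most of the spread in the data.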
Step 8: Project Data onto PC1
Projection formula:
Projected Value = Centered Point · PC1
PC1:
[0.7071, -0.7071]
Point A
Centered A:
[0.5, -1.5]
Projection:
= (0.5 × 0.7071) + (-1.5 × -0.7071)
= 0.35355 + 1.06065
= 1.4142
Point B
Centered B:
[-1.5, 0.5]
Projection:
= (-1.5 × 0.7071) + (0.5 × -0.7071)
= -1.06065 - 0.35355
= -1.4142
Point C
Centered C:
[1.5, -0.5]
Projection:
= (1.5 × 0.7071) + (-0.5 × -0.7071)
= 1.06065 + 0.35355
= 1.4142
Point D
Centered D:
[-0.5, 1.5]
Projection:
= (-0.5 × 0.7071) + (1.5 × -0.7071)
= -0.35355 - 1.06065
= -1.4142
Final 1D Data
| Point | Original Data | Centered Data | 1D PCA Value |
|---|---|---|---|
| A | (2, 0) | (0.5, -1.5) | 1.4142 |
| B | (0, 2) | (-1.5, 0.5) | -1.4142 |
| C | (3, 1) | (1.5, -0.5) | 1.4142 |
| D | (1, 3) | (-0.5, 1.5) | -1.4142 |
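The four dot products above collapse into a single matrix-vector product. This is a minimal sketch with illustrative names, using the exact value 1/√2 instead of the rounded 0.7071:

```python
import numpy as np

X = np.array([[2, 0], [0, 2], [3, 1], [1, 3]], dtype=float)
X_centered = X - X.mean(axis=0)
pc1 = np.array([1.0, -1.0]) / np.sqrt(2)

# Each row of X_centered is dotted with pc1, giving one value per point.
X_1d = X_centered @ pc1
print(np.round(X_1d, 4))

# The sample variance of the projected values equals the largest eigenvalue.
projected_variance = X_1d.var(ddof=1)
print(projected_variance)
```

The projected values come out as 1.4142, -1.4142, 1.4142, -1.4142 for A, B, C, D, and their sample variance is 2.6667, the eigenvalue of PC1.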
Summary
PCA first centers the data, then calculates the covariance matrix to measure how the features vary together.
The eigenvectors of the covariance matrix give the principal directions.
Each eigenvalue equals the variance of the data along its eigenvector.
Since λ = 2.6667 is larger than λ = 0.6667, we select its eigenvector:
[0.7071, -0.7071]
This becomes the first principal component, the direction of maximum variance.
Finally, we project the centered data onto this principal component to convert the data from 2D into 1D.
Python Program
import numpy as np
from sklearn.decomposition import PCA
# Original data
X = np.array([
[2, 0],
[0, 2],
[3, 1],
[1, 3]
])
print("Original Data:")
print(X)
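# Apply PCA to reduce 2D data into 1D.
# PCA centers the data internally, so X is passed in uncentered.
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)
# Projected values have magnitude 1.4142. Their signs may all be flipped
# relative to the hand calculation, since an eigenvector is only defined up to sign.
print("\n1D PCA Output:")
print(X_pca)
# Unit-length direction of PC1: [0.7071, -0.7071] or its negation.
print("\nPrincipal Component:")
print(pca.components_)
# Variance along PC1 (computed with n - 1 in the denominator): 2.6667.
print("\nExplained Variance:")
print(pca.explained_variance_)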