PCA Example - Machine Learning

PCA Problem

Given data:

Point	X	Y
A	2	0
B	0	2
C	3	1
D	1	3

Goal: Reduce 2D data into 1D using PCA.

Step 1: Calculate Mean

Mean of X = (2 + 0 + 3 + 1) / 4 = 6 / 4 = 1.5

Mean of Y = (0 + 2 + 1 + 3) / 4 = 6 / 4 = 1.5

Step 2: Center the Data

Subtract the mean of each feature from every value of that feature.

Point	X	Y	X - 1.5	Y - 1.5
A	2	0	0.5	-1.5
B	0	2	-1.5	0.5
C	3	1	1.5	-0.5
D	1	3	-0.5	1.5

Centered matrix:

X_centered =

[  0.5   -1.5 ]
[ -1.5    0.5 ]
[  1.5   -0.5 ]
[ -0.5    1.5 ]
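Steps 1 and 2 can be reproduced with a few lines of NumPy. This is a minimal sketch, not part of the original worked example; the variable names X, mean, and X_centered are chosen here for illustration.

```python
import numpy as np

# The four points A, B, C, D (columns are X and Y).
X = np.array([[2, 0], [0, 2], [3, 1], [1, 3]], dtype=float)

# axis=0 averages down each column, giving one mean per feature.
mean = X.mean(axis=0)

# Broadcasting subtracts the per-feature mean from every row.
X_centered = X - mean

print(mean)        # [1.5 1.5]
print(X_centered)

# After centering, each column sums to zero.
print(X_centered.sum(axis=0))
```

A quick sanity check after centering is that every column sums to zero, which is what the last line prints.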

Step 3: Calculate Covariance Matrix

The covariance matrix describes how each feature varies individually and how the features vary with respect to one another.

The covariance matrix has the following structure:

               | Variance(X)      Covariance(X,Y) |
Covariance =   |                                  |
               | Covariance(Y,X)   Variance(Y)    |

First, calculate the Variance of X.

Variance(X)

= (0.5² + (-1.5)² + 1.5² + (-0.5)²) / (4 - 1)

= (0.25 + 2.25 + 2.25 + 0.25) / 3

= 5 / 3

= 1.6667

Next, calculate the Variance of Y.

Variance(Y)

= ((-1.5)² + 0.5² + (-0.5)² + 1.5²) / (4 - 1)

= (2.25 + 0.25 + 0.25 + 2.25) / 3

= 5 / 3

= 1.6667

Now calculate the Covariance between X and Y.

Cov(X,Y)

= [(0.5)(-1.5) + (-1.5)(0.5) + (1.5)(-0.5) + (-0.5)(1.5)] / (4 - 1)

= (-0.75 - 0.75 - 0.75 - 0.75) / 3

= -3 / 3

= -1

Since covariance is symmetric,

Cov(Y,X) = Cov(X,Y) = -1

Finally, substitute these values into the covariance matrix.

               | Variance(X)      Cov(X,Y) |
Covariance =   |                           |
               | Cov(Y,X)      Variance(Y) |

               | 1.6667   -1.0000 |
Covariance =   |                  |
               | -1.0000  1.6667  |

Therefore, the covariance matrix is:

[ 1.6667   -1.0000 ]
[ -1.0000   1.6667 ]

Alternatively, the same matrix can be obtained in one step with matrix multiplication, using the centered matrix from Step 2.

Formula:

Covariance Matrix = (X_centeredᵀ X_centered) / (n - 1)

Here:

n = 4 (the number of data points)

n - 1 = 3
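The matrix formula can be checked numerically. The sketch below assumes the centered matrix from Step 2 and compares the hand formula with NumPy's np.cov, which divides by n - 1 by default; the names cov_manual and cov_numpy are illustrative.

```python
import numpy as np

# Centered data from Step 2 (rows are points A, B, C, D).
X_centered = np.array([[0.5, -1.5], [-1.5, 0.5], [1.5, -0.5], [-0.5, 1.5]])
n = X_centered.shape[0]

# Covariance via the matrix formula: (X^T X) / (n - 1).
cov_manual = X_centered.T @ X_centered / (n - 1)

# rowvar=False tells np.cov that columns are features and rows are samples.
cov_numpy = np.cov(X_centered, rowvar=False)

print(cov_manual)
print(np.allclose(cov_manual, cov_numpy))  # True
```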

Step 4: Find Eigenvalues

Covariance matrix:

A =

[  1.6667   -1.0000 ]
[ -1.0000    1.6667 ]

To find eigenvalues, solve:

|A - λI| = 0

That means:

| 1.6667 - λ      -1       |
|    -1        1.6667 - λ  | = 0

For a 2 × 2 matrix:

|a b|
|c d| = ad - bc

So:

(1.6667 - λ)(1.6667 - λ) - (-1)(-1) = 0

(1.6667 - λ)² - 1 = 0

Now solve:

(1.6667 - λ)² = 1

Take the square root of both sides:

1.6667 - λ = ±1

Case 1

1.6667 - λ = 1

λ = 0.6667

Case 2

1.6667 - λ = -1

λ = 2.6667

So eigenvalues are:

λ₁ = 2.6667

λ₂ = 0.6667

The larger eigenvalue is:

2.6667

This means the first principal component corresponds to:

λ = 2.6667
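For a 2 × 2 matrix, the characteristic equation expands to λ² - (trace)λ + (determinant) = 0, so the eigenvalues follow from the quadratic formula. The sketch below applies that to the covariance matrix and cross-checks against np.linalg.eigvalsh, which returns eigenvalues of a symmetric matrix in ascending order. The variable names are illustrative.

```python
import numpy as np

# Covariance matrix from Step 3, using exact fractions instead of 1.6667.
A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])

# Characteristic polynomial of a 2x2 matrix: λ² - trace·λ + det = 0.
trace = np.trace(A)
det = np.linalg.det(A)
disc = np.sqrt(trace**2 - 4 * det)

lam1 = (trace + disc) / 2  # larger eigenvalue, about 2.6667
lam2 = (trace - disc) / 2  # smaller eigenvalue, about 0.6667
print(lam1, lam2)

# Cross-check with NumPy (ascending order for symmetric matrices).
eigvals = np.linalg.eigvalsh(A)
print(eigvals)
```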

Step 5: Find Eigenvector for λ = 2.6667

Eigenvector formula:

(A - λI)v = 0

Let:

v = [v₁, v₂]

Substitute:

λ = 2.6667

A - λI =

[ 1.6667 - 2.6667      -1             ]
[      -1          1.6667 - 2.6667    ]

=

[ -1   -1 ]
[ -1   -1 ]

Now solve:

[ -1   -1 ] [v₁]   [0]
[ -1   -1 ] [v₂] = [0]

This gives:

-v₁ - v₂ = 0

So:

v₁ + v₂ = 0

Therefore:

v₂ = -v₁

Choose:

v₁ = 1

Then:

v₂ = -1

So one eigenvector is:

[ 1 ]
[-1 ]

But PCA uses a unit vector, so we normalize it.

Length:

√(1² + (-1)²)

= √2

= 1.4142

Normalized eigenvector:

[ 1 / √2  ]
[-1 / √2  ]

=

[  0.7071 ]
[ -0.7071 ]

So,

PC1 = [0.7071, -0.7071]
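A quick way to confirm this result is to verify that A v = λ v holds for the normalized vector. The sketch below does that check; note that the negated vector [-0.7071, 0.7071] satisfies the same equation, so the sign of an eigenvector is a matter of convention.

```python
import numpy as np

A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])
lam = 8 / 3  # the larger eigenvalue, 2.6667

# Unnormalized eigenvector from v2 = -v1 with v1 = 1.
v = np.array([1.0, -1.0])

# Divide by the length sqrt(2) to get a unit vector.
pc1 = v / np.linalg.norm(v)
print(pc1)

# Both pc1 and -pc1 satisfy A v = λ v.
print(np.allclose(A @ pc1, lam * pc1))        # True
print(np.allclose(A @ (-pc1), lam * (-pc1)))  # True
```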

Step 6: Find Eigenvector for λ = 0.6667

Again:

(A - λI)v = 0

Substitute:

λ = 0.6667

A - λI =

[ 1.6667 - 0.6667      -1             ]
[      -1          1.6667 - 0.6667    ]

=

[ 1   -1 ]
[-1    1 ]

Now solve:

[ 1   -1 ] [v₁]   [0]
[-1    1 ] [v₂] = [0]

This gives:

v₁ - v₂ = 0

So:

v₁ = v₂

Choose:

v₁ = 1

Then:

v₂ = 1

So one eigenvector is:

[1]
[1]

Normalize it.

Length:

√(1² + 1²)

= √2

= 1.4142

Normalized eigenvector:

[1 / √2]
[1 / √2]

=

[0.7071]
[0.7071]

So,

PC2 = [0.7071, 0.7071]
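Because the covariance matrix is symmetric, its two eigenvectors should be perpendicular. The following sketch checks the eigenvector equation for PC2 and the orthogonality of PC1 and PC2; the variable names are illustrative.

```python
import numpy as np

A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])
pc1 = np.array([1.0, -1.0]) / np.sqrt(2)
pc2 = np.array([1.0, 1.0]) / np.sqrt(2)

# PC2 satisfies A v = λ v with the smaller eigenvalue λ = 2/3 (0.6667).
pc2_ok = np.allclose(A @ pc2, (2 / 3) * pc2)
print(pc2_ok)  # True

# The dot product of perpendicular vectors is zero.
dot = pc1 @ pc2
print(np.isclose(dot, 0.0))  # True
```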

Step 7: Select Principal Component

Eigenvalues:

Eigenvalue	Eigenvector	Meaning
2.6667	[0.7071, -0.7071]	Maximum variance
0.6667	[0.7071, 0.7071]	Less variance

The largest eigenvalue is:

2.6667

So we select:

PC1 = [0.7071, -0.7071]

This is the direction along which the data varies the most.
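In code, the selection step amounts to sorting the eigenvalues in descending order and keeping the first eigenvector. The sketch below uses np.linalg.eigh, which returns eigenvectors as columns with an arbitrary sign. As an addition not covered in the text above, it also prints each eigenvalue's share of the total variance.

```python
import numpy as np

A = np.array([[5 / 3, -1.0], [-1.0, 5 / 3]])

# eigh returns ascending eigenvalues; eigenvectors are the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(A)

# Reorder so the largest eigenvalue comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# First principal component (its sign may differ from the hand calculation).
pc1 = eigvecs[:, 0]
print(eigvals[0], pc1)

# Share of total variance carried by each component.
ratio = eigvals / eigvals.sum()
print(ratio)
```

For this data set PC1 carries 2.6667 / (2.6667 + 0.6667) = 80% of the total variance, so little information is lost by dropping PC2.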

Step 8: Project Data onto PC1

Projection formula:

Projected Value = Centered Point · PC1

PC1:

[0.7071, -0.7071]

Point A

Centered A:

[0.5, -1.5]

Projection:

= (0.5 × 0.7071) + (-1.5 × -0.7071)

= 0.35355 + 1.06065

= 1.4142

Point B

Centered B:

[-1.5, 0.5]

Projection:

= (-1.5 × 0.7071) + (0.5 × -0.7071)

= -1.06065 - 0.35355

= -1.4142

Point C

Centered C:

[1.5, -0.5]

Projection:

= (1.5 × 0.7071) + (-0.5 × -0.7071)

= 1.06065 + 0.35355

= 1.4142

Point D

Centered D:

[-0.5, 1.5]

Projection:

= (-0.5 × 0.7071) + (1.5 × -0.7071)

= -0.35355 - 1.06065

= -1.4142

Final 1D Data

Point	Original Data	Centered Data	1D PCA Value
A	(2, 0)	(0.5, -1.5)	1.4142
B	(0, 2)	(-1.5, 0.5)	-1.4142
C	(3, 1)	(1.5, -0.5)	1.4142
D	(1, 3)	(-0.5, 1.5)	-1.4142
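All four projections can be computed at once as a matrix-vector product. The sketch below assumes the centered data and the PC1 vector derived above, and also checks that the variance of the projected values equals the largest eigenvalue, 2.6667.

```python
import numpy as np

X_centered = np.array([[0.5, -1.5], [-1.5, 0.5], [1.5, -0.5], [-0.5, 1.5]])
pc1 = np.array([1.0, -1.0]) / np.sqrt(2)

# Each row's dot product with pc1 is that point's 1D coordinate.
projected = X_centered @ pc1
print(projected)

# Sample variance (dividing by n - 1) of the 1D values.
variance = projected.var(ddof=1)
print(variance)
```

Points A and C receive the same 1D value because they differ only along PC2, the direction that was discarded.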

Summary

PCA first centers the data, then calculates the covariance matrix to understand how the features vary together.

The eigenvectors of the covariance matrix give the principal directions; the eigenvector with the largest eigenvalue points along the direction of maximum variance.

The eigenvalues tell us how much variance exists in each direction.

Since λ = 2.6667 is larger than λ = 0.6667, we select its eigenvector:

[0.7071, -0.7071]

This becomes the first principal component.

Finally, we project the centered data onto this principal component, converting each 2D point into a single 1D value.
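The whole procedure can be collected into one small function. This is a minimal NumPy sketch of the steps summarized above, not a replacement for a library implementation; the helper name pca_project and its return values are hypothetical.

```python
import numpy as np

def pca_project(X, n_components=1):
    """Hypothetical helper: PCA via eigen-decomposition of the covariance matrix."""
    X = np.asarray(X, dtype=float)

    # Steps 1-2: center each feature.
    X_centered = X - X.mean(axis=0)

    # Step 3: covariance matrix with the n - 1 denominator.
    cov = X_centered.T @ X_centered / (X.shape[0] - 1)

    # Steps 4-6: eigenvalues (ascending) and eigenvectors (as columns).
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Step 7: keep the directions with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order].T  # one component per row

    # Step 8: project the centered data onto the kept components.
    return X_centered @ components.T, components, eigvals[order]

X = [[2, 0], [0, 2], [3, 1], [1, 3]]
projected, components, variances = pca_project(X)
print(projected.ravel())
print(components)
print(variances)
```

The signs of the component and of the projected values may be flipped relative to the hand calculation, since an eigenvector is only defined up to sign; the magnitudes (1.4142 for every point) should match.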

Python Program

import numpy as np
from sklearn.decomposition import PCA

# Original data
X = np.array([
    [2, 0],
    [0, 2],
    [3, 1],
    [1, 3]
])

print("Original Data:")
print(X)

# Apply PCA to reduce 2D data into 1D
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)

print("\n1D PCA Output:")
print(X_pca)

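# Unit-length direction of maximum variance. Its sign is arbitrary, so this may
# print [0.7071, -0.7071] or [-0.7071, 0.7071]; the 1D output flips sign with it.
print("\nPrincipal Component:")
print(pca.components_)

# Variance along the component (n - 1 denominator); matches the eigenvalue 2.6667.
print("\nExplained Variance:")
print(pca.explained_variance_)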