K-Nearest Neighbors - Machine Learning

Brief Idea

K-Nearest Neighbors, commonly called KNN, is a supervised machine learning algorithm used for classification and regression.

In classification, KNN predicts the class of a new data point by looking at the classes of its nearest neighbors.

The main idea is:

Similar data points usually belong to similar classes.

Why KNN is Called a Non-Linear Classifier

KNN does not create one fixed straight-line decision boundary like Logistic Regression or Linear SVM.

Instead, it classifies points based on nearby training examples.

Because of this, KNN can create:

Curved boundaries
Irregular boundaries
Complex boundaries

So KNN is considered a non-linear classifier.

Intuition

Suppose you move to a new city and want to know whether an area is expensive or affordable.

You check nearby houses:

Nearest 5 houses
      ↓
Most are expensive
      ↓
Your area is likely expensive

KNN works in the same way.

For a new data point:

Find nearest K points
      ↓
Check their class labels
      ↓
Choose majority class

Meaning of K in KNN

In KNN, K means the number of nearest neighbors considered for prediction.

Example:

K = 3

means:

Look at 3 nearest training points

If among 3 neighbors:

2 belong to Class A
1 belongs to Class B

Then prediction:

Class A

How KNN Works

Step 1: Store Training Data

KNN does not build a complex model during training.

It simply stores the training data.

That is why KNN is called a:

Lazy Learning Algorithm

Step 2: Take a New Data Point

Suppose a new point needs classification.

Step 3: Calculate Distance

KNN calculates the distance between the new point and all training points.

Common distance metric:

Euclidean Distance

Formula:

Distance = √((x2 - x1)² + (y2 - y1)²)

Step 4: Select K Nearest Neighbors

Sort all distances from smallest to largest.

Select the nearest K points.

Step 5: Majority Voting

Count the class labels among selected K neighbors.

The class with the highest count becomes the prediction.

Example Dataset

Point	x1	x2	Class
A	1	1	Red
B	2	1	Red
C	1	2	Red
D	5	5	Blue
E	6	5	Blue
F	5	6	Blue

New point:

P = (3,3)

Let:

K = 3

We need to classify P.

Step 1: Calculate Distance from P to Each Point

Distance formula:

Distance = √((x2 - x1)² + (y2 - y1)²)

Distance from P(3,3) to A(1,1)

Distance = √((3 - 1)² + (3 - 1)²)
= √(2² + 2²)
= √8
= 2.828

Distance from P(3,3) to B(2,1)

Distance = √((3 - 2)² + (3 - 1)²)
= √(1² + 2²)
= √5
= 2.236

Distance from P(3,3) to C(1,2)

Distance = √((3 - 1)² + (3 - 2)²)
= √(2² + 1²)
= √5
= 2.236

Distance from P(3,3) to D(5,5)

Distance = √((3 - 5)² + (3 - 5)²)
= √((-2)² + (-2)²)
= √8
= 2.828

Distance from P(3,3) to E(6,5)

Distance = √((3 - 6)² + (3 - 5)²)
= √((-3)² + (-2)²)
= √13
= 3.606

Distance from P(3,3) to F(5,6)

Distance = √((3 - 5)² + (3 - 6)²)
= √((-2)² + (-3)²)
= √13
= 3.606

Step 2: Sort Distances

Point	Distance	Class
B	2.236	Red
C	2.236	Red
A	2.828	Red
D	2.828	Blue
E	3.606	Blue
F	3.606	Blue

Step 3: Select K Nearest Neighbors

For:

K = 3

Nearest 3 points are:

Neighbor	Class
B	Red
C	Red
A	Red

Step 4: Majority Voting

Among 3 nearest neighbors:

Red = 3
Blue = 0

So prediction:

P(3,3) = Red

How K Value Affects Prediction

Small K

Example:

K = 1

The model only checks the nearest single point.

This can be sensitive to noise.

Problem:

High variance
Overfitting

Large K

Example:

K = 15

The model considers many points.

This can smooth the boundary too much.

Problem:

High bias
Underfitting

Good K

A balanced K gives better classification.

Common practice:

Use odd K values

Examples:

K = 3, 5, 7

Odd K helps avoid ties in binary classification.

Distance Metrics in KNN

1. Euclidean Distance

Most common distance metric.

Used for continuous numerical data.

Straight-line distance between two points

2. Manhattan Distance

Distance measured by moving only horizontally and vertically.

Formula:

|x2 - x1| + |y2 - y1|

Useful in grid-like data.

3. Minkowski Distance

General form of Euclidean and Manhattan distance.

4. Hamming Distance

Used for categorical or binary features.

Why Feature Scaling is Important

KNN depends on distance.

If one feature has very large values, it dominates distance calculation.

Example:

Feature	Range
Age	18 to 60
Salary	20,000 to 2,00,000

Salary values are much larger, so KNN may ignore age.

To solve this, use:

Standardization
Normalization

KNN Decision Boundary

KNN creates decision boundaries based on local neighborhoods.

Small K:

Complex boundary

Large K:

Smoother boundary

This is why KNN can handle non-linear classification problems.

Advantages

Simple to understand
No training phase required
Works well for small datasets
Can handle non-linear boundaries
Useful for multi-class classification

Limitations

Slow during prediction
Sensitive to irrelevant features
Sensitive to feature scaling
Memory intensive
Choosing correct K is important
Performs poorly with high-dimensional data

Applications

Recommendation systems
Image classification
Pattern recognition
Medical diagnosis
Customer segmentation
Handwriting recognition

Python Implementation

from sklearn.neighbors import KNeighborsClassifier

X = [
    [1, 1],
    [2, 1],
    [1, 2],
    [5, 5],
    [6, 5],
    [5, 6]
]

y = ["Red", "Red", "Red", "Blue", "Blue", "Blue"]

model = KNeighborsClassifier(n_neighbors=3)

model.fit(X, y)

prediction = model.predict([[3, 3]])

print("Prediction:", prediction)

Output:

Prediction: ['Red']

Important Points

KNN stands for K-Nearest Neighbors.
KNN is a supervised learning algorithm.
KNN is used for classification and regression.
It is called a lazy learning algorithm.
KNN stores training data instead of learning parameters.
Prediction is based on nearest neighbors.
K value controls how many neighbors are considered.
Majority voting is used for classification.
KNN can create non-linear decision boundaries.
Feature scaling is very important in KNN.

Keywords

K Nearest Neighbors, KNN Classifier, Non Linear Classifier, Lazy Learning Algorithm, Euclidean Distance, Majority Voting, K Value, Distance Based Algorithm, KNN Decision Boundary, Machine Learning Classification