Gaussian Naive Bayes - Machine Learning

In the previous tutorial, we learned that Naive Bayes uses Bayes Theorem to calculate probabilities and classify data. However, one important question remains:

How do we calculate probabilities when the features are continuous numerical values?

For example:

Age = 25
Salary = 50000
Height = 170 cm
Weight = 70 kg

These values are not categories like:

High
Medium
Low

They are continuous numerical values.

This is where Gaussian Naive Bayes comes into the picture.

Why Do We Need Gaussian Naive Bayes?

Suppose we want to classify whether a person is:

Male
Female

based on:

Height
Weight

Training data:

Height	Weight	Class
170	70	Male
180	80	Male
175	75	Male
155	50	Female
160	55	Female
165	60	Female

Now suppose a new person has:

Height = 172
Weight = 72

Question:

What is P(Height=172 | Male)?

Since height is continuous, an exact value such as 172 may never appear in the training data, so we cannot simply count occurrences.

This is why Gaussian Naive Bayes assumes that numerical features follow a:

Normal Distribution

also called:

Gaussian Distribution
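
A quick sketch of why counting breaks down, using the male heights from the training table. The variable names are illustrative only.

```python
male_heights = [170, 180, 175]
new_height = 172

# Frequency-based estimate: fraction of male records with exactly this height.
matches = male_heights.count(new_height)
frequency_estimate = matches / len(male_heights)

# 172 never occurs among the male training heights, so the estimate is zero.
print(frequency_estimate)
```

A zero estimate would rule out the Male class entirely, even though 172 is a perfectly plausible male height. A smooth distribution avoids this problem.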

What is a Gaussian Distribution?

A Gaussian Distribution is the famous bell-shaped curve.

              *
            *   *
          *       *
        *           *
      *               *
----*-------------------*----

Characteristics:

Fully described by its mean (μ) and standard deviation (σ)
Bell-shaped curve
Symmetric around the mean

Quantities such as the following often approximately follow a Gaussian distribution:

Height of people
Weight of people
Exam scores
Blood pressure

Core Assumption of Gaussian Naive Bayes

Gaussian Naive Bayes assumes:

Each feature follows a Normal (Gaussian) Distribution within each class.

Example:

Male Heights
      ↓
Gaussian Distribution

Female Heights
      ↓
Gaussian Distribution

Probability Density Function (PDF)

To calculate probabilities, Gaussian Naive Bayes uses the Gaussian formula:

P(x) = 1 / √(2πσ²) × e^(-(x-μ)² / (2σ²))

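This formula gives the probability density at a value. A density is not itself a probability (it can even exceed 1), but Gaussian Naive Bayes uses it as the likelihood P(x|class) when comparing classes.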

Important idea:

Values near the mean
      ↓
High Probability Density

Values far from the mean
      ↓
Low Probability Density
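
The density formula can be sketched in a few lines of Python. The function name gaussian_pdf and the standard normal example (mean 0, variance 1) are illustrative choices, not part of any library.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Gaussian probability density of x for a given mean and variance."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * variance)
    exponent = -((x - mean) ** 2) / (2.0 * variance)
    return coefficient * math.exp(exponent)

# Density is highest at the mean and falls off with distance from it.
at_mean = gaussian_pdf(0.0, mean=0.0, variance=1.0)
near_mean = gaussian_pdf(0.5, mean=0.0, variance=1.0)
far_from_mean = gaussian_pdf(3.0, mean=0.0, variance=1.0)
print(at_mean, near_mean, far_from_mean)
```

The three values decrease in that order, which is the "near the mean is high, far from the mean is low" idea in numeric form.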

Example Dataset

Suppose we have:

Height	Class
170	Male
180	Male
175	Male
155	Female
160	Female
165	Female

We want to classify:

Height = 172

Step 1: Calculate Prior Probabilities

Total records: 6

Male: 3

Female: 3

Therefore:

P(Male)=3/6=0.5

P(Female)=3/6=0.5
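
As a small sketch, the same priors can be obtained by counting labels. The list below simply copies the Class column of the example dataset.

```python
from collections import Counter

labels = ["Male", "Male", "Male", "Female", "Female", "Female"]

# Prior of a class = number of records in that class / total records.
counts = Counter(labels)
total = len(labels)
priors = {cls: count / total for cls, count in counts.items()}
print(priors)
```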

Step 2: Calculate Mean for Each Class

Male Mean

Male heights:

170
180
175

Mean:

μ = (170+180+175)/3

μ = 525/3

μ = 175

Female Mean

Female heights:

155
160
165

Mean:

μ = (155+160+165)/3

μ = 480/3

μ = 160
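
The per-class means can be checked with a short sketch. The dictionary layout is just one convenient way to group the heights by class.

```python
heights = {
    "Male": [170, 180, 175],
    "Female": [155, 160, 165],
}

# Mean of a class = sum of its values / number of values.
means = {cls: sum(values) / len(values) for cls, values in heights.items()}
print(means)
```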

Step 3: Calculate Variance

Male heights:

170, 180, 175

Mean: μ = 175

Variance (dividing by the number of values, n = 3):

σ² = [(170-175)²+(180-175)²+(175-175)²]/3

σ² = (25+25+0)/3

σ² = 50/3

σ² ≈ 16.67

Female heights:

155,160,165

Mean: μ = 160

Variance:

σ² = [(155-160)²+(160-160)²+(165-160)²]/3

σ² = (25+0+25)/3

σ² = 50/3

σ² ≈ 16.67
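
The variance step can be sketched the same way. Note that this divides by the number of values (population variance), matching the hand calculation above, rather than by n - 1. The helper name population_variance is illustrative.

```python
def population_variance(values):
    """Average squared distance from the mean, dividing by n."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

heights = {
    "Male": [170, 180, 175],
    "Female": [155, 160, 165],
}

variances = {cls: population_variance(values) for cls, values in heights.items()}
print({cls: round(var, 2) for cls, var in variances.items()})
```

Both classes have the same spread around their own mean, so both variances come out to 50/3.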

Step 4: Calculate Likelihood Using Gaussian Formula

New height:

x = 172

Gaussian probability density formula:

P(x|class) = 1 / √(2πσ²) × e^(-(x-μ)² / (2σ²))

Where:

x  = new value
μ  = class mean
σ² = class variance
π  ≈ 3.1416
e  ≈ 2.718

For Male Class

From previous steps:

Mean, μ = 175
Variance, σ² = 16.67
x = 172

Substitute into formula:

P(172|Male) = 1 / √(2 × 3.1416 × 16.67)
              × e^(-((172 - 175)² / (2 × 16.67)))

Now simplify:

172 - 175 = -3

(-3)² = 9

2 × 16.67 = 33.34

So exponent part:

-9 / 33.34 = -0.27

Now denominator part:

2 × 3.1416 × 16.67 = 104.74

√104.74 = 10.23

So:

1 / 10.23 = 0.0977

Now exponential value:

e^(-0.27) ≈ 0.763

Therefore:

P(172|Male) = 0.0977 × 0.763

P(172|Male) ≈ 0.0745

For Female Class

From previous steps:

Mean, μ = 160
Variance, σ² = 16.67
x = 172

Substitute into formula:

P(172|Female) = 1 / √(2 × 3.1416 × 16.67)
                × e^(-((172 - 160)² / (2 × 16.67)))

Now simplify:

172 - 160 = 12

12² = 144

2 × 16.67 = 33.34

So exponent part:

-144 / 33.34 = -4.32

Denominator part is same:

√(2 × 3.1416 × 16.67) = 10.23

So:

1 / 10.23 = 0.0977

Now exponential value:

e^(-4.32) ≈ 0.0133

Therefore:

P(172|Female) = 0.0977 × 0.0133

P(172|Female) ≈ 0.0013
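
Both likelihoods can be reproduced with a short sketch. It uses the unrounded variance 50/3, so the male value comes out very slightly different from the hand calculation, which rounds intermediate results. The function name gaussian_pdf is an illustrative choice.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Gaussian probability density of x for a given mean and variance."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return coefficient * math.exp(-((x - mean) ** 2) / (2.0 * variance))

variance = 50 / 3  # 16.67 before rounding; the same for both classes here

likelihood_male = gaussian_pdf(172, mean=175, variance=variance)
likelihood_female = gaussian_pdf(172, mean=160, variance=variance)
print(round(likelihood_male, 4), round(likelihood_female, 4))
```

The male likelihood is roughly 57 times larger than the female one, because 172 is only 3 cm from the male mean but 12 cm from the female mean.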

Step 5: Apply Bayes Theorem

Prior probabilities:

P(Male) = 0.5
P(Female) = 0.5

We compare:

P(Male|172) ∝ P(172|Male) × P(Male)

P(Female|172) ∝ P(172|Female) × P(Female)

Male Probability

P(Male|172) ∝ 0.0745 × 0.5

P(Male|172) ∝ 0.03725

Female Probability

P(Female|172) ∝ 0.0013 × 0.5

P(Female|172) ∝ 0.00065

Step 6: Compare Probabilities

Class	Probability Score
Male	0.03725
Female	0.00065

Since:

0.03725 > 0.00065

Prediction:

Male

The height value 172 is much closer to the male mean 175 than the female mean 160, so Gaussian Naive Bayes predicts the class as Male.
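
The scores above are only proportional to the posterior probabilities. As an optional extra step, dividing each score by their sum turns them into probabilities that add up to 1. The sketch below starts from the rounded likelihoods in the worked example.

```python
# Unnormalized scores: likelihood × prior, using the rounded values from the text.
scores = {
    "Male": 0.0745 * 0.5,
    "Female": 0.0013 * 0.5,
}

# Dividing by the total makes the scores sum to 1.
evidence = sum(scores.values())
posteriors = {cls: score / evidence for cls, score in scores.items()}

prediction = max(posteriors, key=posteriors.get)
print(prediction, posteriors)
```

Normalizing does not change which class wins, which is why the comparison in Step 6 can use the raw scores directly.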

How Gaussian Naive Bayes Works

Training Data
       ↓
Calculate Mean
       ↓
Calculate Variance
       ↓
Assume Gaussian Distribution
       ↓
Compute Probability Density
       ↓
Apply Bayes Theorem
       ↓
Select Highest Probability Class
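
The pipeline above can be sketched as a small class. This is a teaching sketch under simplifying assumptions, not the scikit-learn implementation: the class name SimpleGaussianNB is made up, it works with log scores to avoid multiplying very small numbers, and it has no protection against a feature with zero variance.

```python
import math
from collections import defaultdict

class SimpleGaussianNB:
    """Illustrative Gaussian Naive Bayes for small examples."""

    def fit(self, rows, labels):
        # Group the training rows by class label.
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)

        self.priors, self.means, self.variances = {}, {}, {}
        for label, class_rows in grouped.items():
            n = len(class_rows)
            self.priors[label] = n / len(rows)
            columns = list(zip(*class_rows))  # one tuple of values per feature
            self.means[label] = [sum(col) / n for col in columns]
            self.variances[label] = [
                sum((v - m) ** 2 for v in col) / n
                for col, m in zip(columns, self.means[label])
            ]
        return self

    def _log_score(self, row, label):
        # log(prior) plus the sum of log densities, one per feature.
        score = math.log(self.priors[label])
        for x, mean, var in zip(row, self.means[label], self.variances[label]):
            score += -0.5 * math.log(2.0 * math.pi * var)
            score += -((x - mean) ** 2) / (2.0 * var)
        return score

    def predict(self, row):
        # Select the class with the highest score.
        return max(self.priors, key=lambda label: self._log_score(row, label))


rows = [[170], [180], [175], [155], [160], [165]]
labels = ["Male", "Male", "Male", "Female", "Female", "Female"]

model = SimpleGaussianNB().fit(rows, labels)
prediction = model.predict([172])
print(prediction)
```

Training only computes and stores a prior, a mean, and a variance, which is why fitting is so fast.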

Why Mean and Variance Are Important

Gaussian Naive Bayes stores:

Mean (μ)
Variance (σ²)

for every feature in every class.

Example:

Class	Mean Height	Variance
Male	175	16.67
Female	160	16.67

Using these values, it computes probability densities for new observations.

Real-Life Applications

Medical Diagnosis

Age
Blood Pressure
Sugar Level

Credit Risk Prediction

Salary
Loan Amount
Age

Customer Classification

Income
Spending Score

Student Performance Prediction

Study Hours
Attendance
Marks

Python Implementation

from sklearn.naive_bayes import GaussianNB
import numpy as np

# Training Data
X = np.array([
    [170],
    [180],
    [175],
    [155],
    [160],
    [165]
])

y = np.array([
    "Male",
    "Male",
    "Male",
    "Female",
    "Female",
    "Female"
])

# Create Model
model = GaussianNB()

# Train Model
model.fit(X, y)

# Predict
prediction = model.predict([[172]])

print("Prediction:", prediction[0])

Output:

Prediction: Male
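
Besides the predicted label, the model can also report how confident it is. The sketch below repeats the training step so that it runs on its own, then calls predict_proba, which returns one probability per class in the order given by classes_.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[170], [180], [175], [155], [160], [165]])
y = np.array(["Male", "Male", "Male", "Female", "Female", "Female"])

model = GaussianNB().fit(X, y)

# One row per sample, one column per class (ordered as in model.classes_).
probabilities = model.predict_proba([[172]])[0]
posterior = {str(cls): float(p) for cls, p in zip(model.classes_, probabilities)}
print(posterior)
```

The Male probability should be roughly 0.98, in line with the large gap between the two scores in the hand calculation.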

Advantages

Very fast training
Works well with numerical data
Requires relatively little training data
Easy to implement
Handles multiple classes

Limitations

Assumes each feature is Gaussian within each class
Assumes feature independence
Performance decreases when data is not normally distributed
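
To make the independence assumption concrete, here is a sketch with two features, using the height and weight table from the start of the tutorial. Because the features are treated as independent within each class, the per-feature densities are simply multiplied. The helper names and the dictionary layout are illustrative.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Gaussian probability density of x for a given mean and variance."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return coefficient * math.exp(-((x - mean) ** 2) / (2.0 * variance))

def mean_and_variance(values):
    """Mean and population variance (dividing by n) of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance

training = {
    "Male": {"height": [170, 180, 175], "weight": [70, 80, 75]},
    "Female": {"height": [155, 160, 165], "weight": [50, 55, 60]},
}
new_person = {"height": 172, "weight": 72}
prior = 0.5  # three of the six records belong to each class

scores = {}
for cls, features in training.items():
    score = prior
    for name, values in features.items():
        mean, variance = mean_and_variance(values)
        # Independence assumption: multiply the density of each feature.
        score *= gaussian_pdf(new_person[name], mean, variance)
    scores[cls] = score

prediction = max(scores, key=scores.get)
print(prediction, scores)
```

In real data, height and weight are correlated, so multiplying their densities counts overlapping evidence twice. That is the practical cost of the independence assumption.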

Important Points

Gaussian Naive Bayes is used for continuous numerical features.
It assumes each feature follows a Gaussian distribution within each class.
It stores the mean and variance of each feature for each class.
Probability density is calculated using the Gaussian formula.
Bayes Theorem is then applied to classify new observations.
It works well for small and medium-sized datasets.
It is commonly used in medical diagnosis and numerical classification problems.

Keywords

Gaussian Naive Bayes, Normal Distribution, Gaussian Distribution, Probability Density Function, Mean and Variance, Continuous Features, Bayesian Classification, Numerical Data Classification, Machine Learning Classification, Supervised Learning