Naive Bayes Classifier
Naive Bayes is a supervised machine learning algorithm used for classification problems.
It is based on Bayes Theorem, a fundamental concept in probability theory.
The algorithm predicts the class of a data point by calculating probabilities and choosing the class with the highest probability.
In simple words:
Naive Bayes answers the question:
"Given these features, which class is most likely?"
Why is it Called Naive Bayes?
The algorithm assumes that all input features are independent of each other.
For example, suppose we want to predict whether a person buys a car based on:
Age
Income
Education
Naive Bayes assumes:
Age does not affect Income
Income does not affect Education
Education does not affect Age
This assumption is usually not true in real life.
Because of this strong assumption, the algorithm is called:
Naive Bayes
Real-Life Example
Suppose you receive an email.
The email contains words:
Free
Offer
Winner
Prize
You want to classify the email as:
Spam
Not Spam
Naive Bayes calculates:
Probability(Spam | Email)
and
Probability(Not Spam | Email)
Then predicts the class with the higher probability.
Bayes Theorem
Naive Bayes is built on Bayes Theorem.

Where:
| Term | Meaning |
|---|---|
| P(A|B) | Posterior Probability |
| P(B|A) | Likelihood |
| P(A) | Prior Probability |
| P(B) | Evidence |
Understanding the Formula
Suppose:
A = Spam
B = Word "Free"
Then:
P(A)
Probability that an email is Spam.
Example:
40 out of 100 emails are spam
P(Spam) = 40/100 = 0.4
P(B|A)
Probability that a spam email contains the word "Free".
Suppose:
32 spam emails contain "Free"
P(Free∣Spam) = 32/40 = 0.8
P(A|B)
Probability that an email is spam given that it contains the word "Free".
This is what we want to calculate.
Working Example
Suppose we have:
| Email Type | Count |
|---|---|
| Spam | 40 |
| Not Spam | 60 |
Total Emails:
100
Step 1: Calculate Prior Probabilities
Spam
P(Spam)=40/100
Not Spam
P(NotSpam)=60/100
Step 2: Calculate Likelihood
Suppose:
| Word | Spam Emails |
|---|---|
| Free | 32 |
P(Free∣Spam)=32/40
Suppose:
| Word | Not Spam Emails |
|---|---|
| Free | 6 |
P(Free∣NotSpam)=6/60
Step 3: Apply Bayes Theorem
We compare probabilities for each class.
For Spam:
P(Spam∣Free)∝P(Free∣Spam)×P(Spam)
Substitute values:
0.8×0.4
For Not Spam:
P(NotSpam∣Free)∝P(Free∣NotSpam)×P(NotSpam)
0.1×0.6
Step 4: Compare Probabilities
| Class | Probability |
|---|---|
| Spam | 0.32 |
| Not Spam | 0.06 |
Since:
0.32 > 0.06
Prediction:
Spam
Another Example Using Student Data
Suppose we want to predict whether a student will pass an exam.
Training Data:
| Study Hours | Result |
|---|---|
| High | Pass |
| High | Pass |
| Medium | Pass |
| Low | Fail |
| Low | Fail |
Step 1: Prior Probability
Pass:
P(Pass)=3/5=0.6
Fail:
P(Fail)=2/5=0.4
Step 2: Likelihood
For a student with:
Study Hours = High
Likelihood:
P(High∣Pass)=2/3
P(High∣Fail)=0/2
Step 3: Calculate Posterior
Pass:
0.667×0.6
Fail:
0×0.40
Prediction:
Pass
How Naive Bayes Makes Predictions
Training Data
↓
Calculate Prior Probabilities
↓
Calculate Likelihood Probabilities
↓
Apply Bayes Theorem
↓
Compute Posterior Probabilities
↓
Select Highest Probability Class
Why Naive Bayes Works Well
Even though the independence assumption is often incorrect:
Feature Independence Assumption
Naive Bayes still performs surprisingly well because:
-
Probability calculations are simple
-
Less data is required
-
Fast training
-
Fast prediction
Advantages
-
Easy to understand
-
Fast training
-
Fast prediction
-
Works well with small datasets
-
Excellent for text classification
-
Handles high-dimensional data efficiently
Limitations
-
Assumes feature independence
-
Probability estimates may not be accurate
-
Can struggle with highly correlated features
Applications
Spam Detection
Spam / Not Spam
Sentiment Analysis
Positive / Negative
Document Classification
Sports
Politics
Technology
Business
Medical Diagnosis
Disease Prediction
Python Implementation
Example 1: Simple Naive Bayes Classification
from sklearn.naive_bayes import GaussianNB
# Training Data
X = [
[1],
[2],
[3],
[8],
[9],
[10]
]
y = [
"Fail",
"Fail",
"Fail",
"Pass",
"Pass",
"Pass"
]
# Create Model
model = GaussianNB()
# Train Model
model.fit(X, y)
# Predict
prediction = model.predict([[7]])
print("Prediction:", prediction[0])
Output:
Prediction: Pass
Example 2: Email Spam Classification
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
emails = [
"Free prize winner",
"Claim your free offer",
"Meeting at 10 AM",
"Project discussion tomorrow"
]
labels = [
"Spam",
"Spam",
"Not Spam",
"Not Spam"
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)
model = MultinomialNB()
model.fit(X, labels)
test_email = vectorizer.transform(
["Free offer available"]
)
prediction = model.predict(test_email)
print(prediction[0])
Output:
Spam
Important Points
-
Naive Bayes is a probabilistic classification algorithm.
-
It is based on Bayes Theorem.
-
It assumes all features are independent.
-
Prediction is based on posterior probabilities.
-
The class with the highest probability is selected.
-
It is extremely fast and memory efficient.
-
Widely used in spam filtering and text classification.
-
It works surprisingly well despite its naive assumption.
Keywords
Naive Bayes Classifier, Bayes Theorem, Probabilistic Classification, Prior Probability, Posterior Probability, Likelihood, Spam Detection, Machine Learning Classification, Bayesian Learning, Feature Independence Assumption