Bayes Theorem
Before studying the Naive Bayes Classifier, it is important to understand Bayes Theorem, because Naive Bayes is built entirely on it.
Why Do We Need Bayes Theorem?
Suppose a doctor tells a patient:
You tested positive for a disease.
The patient's first question will be:
What is the probability that I actually have the disease?
This is exactly the type of question Bayes Theorem answers.
Bayes Theorem helps us find:
The probability of an event after observing some evidence.
Intuition Behind Bayes Theorem
Normally we ask:
If a person has Disease,
what is the probability of testing Positive?
This is:
P(Positive | Disease)
But in practice we usually know:
Test Result = Positive
and want to find:
P(Disease | Positive)
Bayes Theorem lets us reverse the conditioning: it computes P(Disease | Positive) from P(Positive | Disease).
Bayes Theorem Formula
Mathematically:
P(A|B) = ( P(B|A) × P(A) ) / P(B)
Terminology:
A = Person has Disease
B = Test is Positive
Then:
- P(A): the probability of having the disease before seeing the test result. Called the Prior Probability.
- P(B|A): the probability that the test is positive if the person actually has the disease. Called the Likelihood.
- P(B): the overall probability of getting a positive test result. Called the Evidence.
- P(A|B): the probability that the person has the disease given that the test is positive. Called the Posterior Probability.
The posterior is what we want to calculate.
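The formula and its four terms can be written as a small function. This is a minimal sketch: the function name `posterior` and its parameter names are chosen here for illustration, and the inputs are the disease-test numbers worked out in Example 1 later in this text.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0.0 < evidence <= 1.0:
        raise ValueError("evidence P(B) must be in (0, 1]")
    return likelihood * prior / evidence


# Numbers from Example 1: P(A) = 0.01, P(B|A) = 0.99, P(B) = 0.0594
p = posterior(prior=0.01, likelihood=0.99, evidence=0.0594)
print(round(p, 4))  # 0.1667
```

The check on `evidence` is there because the formula is undefined when P(B) is zero: we cannot condition on an event that never happens.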
Visual Understanding
Prior Knowledge
↓
Observe Evidence
↓
Update Belief
↓
Posterior Probability
Bayes Theorem is simply:
Old Belief + New Evidence = Updated Belief
Example 1: Disease Testing
Suppose:
1% of people have a disease.
Therefore:
P(Disease) = 0.01
The test correctly detects disease 99% of the time.
P(Positive | Disease) = 0.99
The test incorrectly shows positive for healthy people 5% of the time.
P(Positive | No Disease) = 0.05
Step 1: Calculate P(Positive)
Positive results can come from people with the disease and from people without it.
Therefore:
P(Positive) = P(Positive | Disease) × P(Disease) + P(Positive | No Disease) × P(No Disease)
Substitute values, using P(No Disease) = 1 − 0.01 = 0.99:
P(Positive) = (0.99 × 0.01) + (0.05 × 0.99)
P(Positive) = 0.0099 + 0.0495
P(Positive) = 0.0594
Step 2: Apply Bayes Theorem
We want:
P(Disease | Positive)
Substitute values:
P(Disease | Positive) = (0.99 × 0.01) / 0.0594
P(Disease | Positive) = 0.0099 / 0.0594
P(Disease | Positive) ≈ 0.1667
Final Answer
P(Disease | Positive) ≈ 16.67%
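The two steps of Example 1 can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the variable names are chosen here for readability.

```python
p_disease = 0.01            # P(Disease), the prior
p_pos_given_disease = 0.99  # P(Positive | Disease), the likelihood
p_pos_given_healthy = 0.05  # P(Positive | No Disease), the false-positive rate

# Step 1: the evidence P(Positive), summed over both groups of people.
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Step 2: Bayes Theorem gives the posterior P(Disease | Positive).
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(f"P(Positive) = {p_positive:.4f}")                    # 0.0594
print(f"P(Disease | Positive) = {p_disease_given_pos:.4f}")  # 0.1667
```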
Surprising Result
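Many people expect that a test which detects the disease 99% of the time
means a 99% chance of disease after a positive result.
But the actual probability is about:
16.67%
Why? Because the disease is very rare. Out of 10,000 people, about 100 have the disease and 99 of them test positive, while 495 of the 9,900 healthy people also test positive. Only 99 of the 594 positive results (16.67%) come from people who actually have the disease. This is the power of Bayes Theorem.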
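To see how strongly the result depends on how rare the disease is, the sketch below repeats the Example 1 calculation for several prior probabilities. Only the 1% value comes from the example; the other prevalence values are hypothetical and chosen purely for illustration. The function name and the parameter names `sensitivity` (for P(Positive | Disease)) and `false_positive_rate` (for P(Positive | No Disease)) are also assumptions of this sketch.

```python
def p_disease_given_positive(prevalence, sensitivity=0.99, false_positive_rate=0.05):
    """Posterior P(Disease | Positive) for a given prior P(Disease)."""
    evidence = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    return sensitivity * prevalence / evidence


# 0.01 is the prior from Example 1; the other values are hypothetical.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    result = p_disease_given_positive(prevalence)
    print(f"prevalence {prevalence:>6.1%} -> posterior {result:.1%}")
```

With the test itself unchanged, the posterior rises as the disease becomes more common, which is why a positive result for a rare disease is much weaker evidence than it first appears.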
Example 2: Email Spam Detection
Suppose:
40% of emails are Spam
60% of emails are Not Spam
Therefore:
P(Spam) = 0.4
P(NotSpam) = 0.6
Suppose:
80% of Spam emails contain the word "Free"
P(Free | Spam) = 0.8
Suppose:
10% of Not Spam emails contain the word "Free"
P(Free | NotSpam) = 0.1
Step 1: Calculate P(Free)
P(Free) = P(Free | Spam) × P(Spam) + P(Free | NotSpam) × P(NotSpam)
Substitute:
P(Free) = (0.8 × 0.4) + (0.1 × 0.6)
P(Free) = 0.32 + 0.06
P(Free) = 0.38
Step 2: Apply Bayes Theorem
P(Spam | Free) = (0.8 × 0.4) / 0.38
P(Spam | Free) = 0.32 / 0.38
P(Spam | Free) ≈ 0.842
Final Answer
P(Spam | Free) ≈ 84.2%
If an email contains the word "Free", there is an 84.2% probability that it is spam.
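The same answer can be reached by counting emails instead of multiplying probabilities. The sketch below assumes a hypothetical inbox of 1,000 emails, a size chosen only so that every count is a whole number, and applies the percentages from Example 2 to it.

```python
total_emails = 1000  # hypothetical inbox size, not from the example

spam = total_emails * 40 // 100            # 400 spam emails
not_spam = total_emails - spam             # 600 non-spam emails
spam_with_free = spam * 80 // 100          # 320 spam emails contain "Free"
not_spam_with_free = not_spam * 10 // 100  # 60 non-spam emails contain "Free"

# Every email containing "Free" falls into one of the two groups.
emails_with_free = spam_with_free + not_spam_with_free  # 380
p_spam_given_free = spam_with_free / emails_with_free

print(f"{spam_with_free} of {emails_with_free} emails containing 'Free' are spam")
print(f"P(Spam | Free) = {p_spam_given_free:.3f}")  # 0.842
```

Counting this way makes the role of P(Free) visible: it is simply the share of all emails that contain the word, and the posterior is the spam share within that group.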
Example 3: Student Pass Prediction
Suppose:
70% of students pass
P(Pass) = 0.7
Therefore, 30% of students fail:
P(Fail) = 0.3
Among students who pass:
80% study regularly
P(Study | Pass) = 0.8
Among students who fail:
20% study regularly
P(Study | Fail) = 0.2
Step 1: Calculate P(Study)
P(Study) = P(Study | Pass) × P(Pass) + P(Study | Fail) × P(Fail)
Substitute:
P(Study) = (0.8 × 0.7) + (0.2 × 0.3)
P(Study) = 0.56 + 0.06
P(Study) = 0.62
Step 2: Calculate P(Pass|Study)
P(Pass | Study) = (0.8 × 0.7) / 0.62
P(Pass | Study) = 0.56 / 0.62
P(Pass | Study) ≈ 0.903
Final Answer
P(Pass | Study) ≈ 90.3%
A student who studies regularly has approximately a 90% chance of passing.
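Example 3 can also be verified with exact fractions, which avoids any rounding in the intermediate steps. This sketch uses Python's standard `fractions` module; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_pass = Fraction(7, 10)              # P(Pass)
p_study_given_pass = Fraction(8, 10)  # P(Study | Pass)
p_study_given_fail = Fraction(2, 10)  # P(Study | Fail)

# Step 1: evidence P(Study), using P(Fail) = 1 - P(Pass).
p_study = p_study_given_pass * p_pass + p_study_given_fail * (1 - p_pass)

# Step 2: posterior P(Pass | Study).
p_pass_given_study = p_study_given_pass * p_pass / p_study

print(p_study)                              # 31/50
print(p_pass_given_study)                   # 28/31
print(f"{float(p_pass_given_study):.3f}")   # 0.903
```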
Why Bayes Theorem is Important in Machine Learning
Many machine learning algorithms and applications update probabilities when new evidence arrives. Examples:
Naive Bayes
Bayesian Networks
Spam Filters
Recommendation Systems
Medical Diagnosis Systems
The Naive Bayes Classifier uses Bayes Theorem directly to calculate the probability of each class.
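As a rough illustration of what "calculating class probabilities" means, the sketch below computes the posterior for every class from a single piece of evidence and picks the most probable class. It reuses the numbers from Example 2; the function name `class_posteriors` is hypothetical. A full Naive Bayes classifier would multiply the likelihoods of many features together, under the assumption that the features are independent given the class, which this single-feature sketch does not show.

```python
def class_posteriors(priors, likelihoods):
    """Return P(class | evidence) for every class.

    priors:      {class: P(class)}
    likelihoods: {class: P(evidence | class)}
    """
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # P(evidence), the normalizer
    return {c: joint[c] / evidence for c in joint}


# Numbers from Example 2; the evidence is "the email contains the word Free".
posteriors = class_posteriors(
    priors={"Spam": 0.4, "NotSpam": 0.6},
    likelihoods={"Spam": 0.8, "NotSpam": 0.1},
)
predicted = max(posteriors, key=posteriors.get)

for name, value in posteriors.items():
    print(f"P({name} | Free) = {value:.3f}")
print("Predicted class:", predicted)
```

Because every class is divided by the same evidence term, the predicted class can be found from the numerators alone; dividing is only needed when the actual probabilities are wanted.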
Bayes Theorem Workflow
Prior Probability
↓
Observe Evidence
↓
Apply Bayes Theorem
↓
Posterior Probability
↓
Make Decision
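The workflow can be sketched as a loop in which each posterior becomes the prior for the next piece of evidence. The sketch below starts from the Example 1 numbers and makes two assumptions that are not in the original text: the patient takes a second test whose result is independent of the first given the true disease status, and a decision is made by comparing the posterior to a hypothetical threshold of 0.5.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One Bayesian update: return P(H | E) from P(H) and the two likelihoods."""
    evidence = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / evidence


DECISION_THRESHOLD = 0.5  # hypothetical cut-off, not from the text

belief = 0.01  # prior probability of disease, from Example 1
history = []
for test_number in (1, 2):
    # Observe evidence (a positive test) and apply Bayes Theorem.
    # Assumption: positive results are independent given the disease status.
    belief = update(belief, p_e_given_h=0.99, p_e_given_not_h=0.05)
    history.append(belief)
    # Make a decision from the updated belief.
    decision = "follow up" if belief >= DECISION_THRESHOLD else "test again"
    print(f"after positive test {test_number}: {belief:.4f} -> {decision}")
```

Under these assumptions, one positive result leaves the posterior at about 0.1667, while a second positive result raises it to roughly 0.80, which shows how repeated evidence keeps updating the belief.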
Important Points
- Bayes Theorem is used to update probabilities based on new evidence.
- It reverses conditional probabilities: it gives P(A|B) from P(B|A).
- Prior probability represents the initial belief, before any evidence is seen.
- Likelihood represents the probability of the evidence given that the hypothesis is true.
- Posterior probability is the updated probability after seeing the evidence.
- Evidence, P(B), normalizes the result so that the posterior is a valid probability.
- Bayes Theorem is the foundation of the Naive Bayes Classifier.
- It is widely used in spam detection, medical diagnosis, and machine learning.
Keywords
Bayes Theorem, Conditional Probability, Prior Probability, Posterior Probability, Likelihood, Evidence, Bayesian Learning, Probability Theory, Naive Bayes Foundation, Machine Learning Probability Concepts