Bayes Theorem - Machine Learning

Before understanding Naive Bayes Classifier, it is very important to understand Bayes Theorem, because Naive Bayes is completely built on this concept.

Why Do We Need Bayes Theorem?

Suppose a doctor tells a patient:

You tested positive for a disease.

The patient's first question will be:

What is the probability that I actually have the disease?

This is exactly the type of question Bayes Theorem answers.

Bayes Theorem helps us find:

The probability of an event after observing some evidence.

Intuition Behind Bayes Theorem

Normally we ask:

If a person has Disease,
what is the probability of testing Positive?

This is:

P(Positive | Disease)

But in reality we often know:

Test Result = Positive

and want to find:

P(Disease | Positive)

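Bayes Theorem helps us reverse the conditional probability: it lets us compute P(Disease | Positive) from P(Positive | Disease), together with P(Disease) and P(Positive).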

Bayes Theorem Formula

Mathematically:

P(A|B) = ( P(B|A) × P(A) ) / P(B)

Terminology:

A = Person has Disease

B = Test is Positive

Then:

P(A): Prior Probability

The probability of having the disease before seeing the test result.

P(B|A): Likelihood

The probability that the test is positive if the person actually has the disease.

P(B): Evidence

The overall probability of getting a positive test result.

P(A|B): Posterior Probability

The probability that the person has the disease given that the test is positive.

This is what we want to calculate.
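The formula can be written as a small Python function. This is only a minimal sketch: the function name `bayes_posterior`, its parameter names, and the input values are illustrative assumptions, not part of any library or of the examples in this article.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return the posterior P(A|B) from P(A), P(B|A), and P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive; B has to be possible")
    return likelihood * prior / evidence


# Made-up inputs, chosen only to exercise the function:
# P(A) = 0.5, P(B|A) = 0.2, P(B) = 0.4
posterior = bayes_posterior(prior=0.5, likelihood=0.2, evidence=0.4)
print(posterior)  # 0.25
```

The guard on `evidence` reflects that the formula is undefined when P(B) = 0, because we cannot condition on something that never happens.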

Visual Understanding

Prior Knowledge
      ↓
Observe Evidence
      ↓
Update Belief
      ↓
Posterior Probability

Bayes Theorem is simply:

Old Belief  +  New Evidence = Updated Belief

Example 1: Disease Testing

Suppose:

1% of people have a disease.

Therefore:

P(Disease) = 0.01

The test correctly detects disease 99% of the time.

P(Positive | Disease) = 0.99

The test incorrectly shows positive for healthy people 5% of the time.

P(Positive | No Disease) = 0.05

Step 1: Calculate P(Positive)

Positive results can come from two groups:

People with Disease (correct positives)
+
People without Disease (incorrect positives)

Therefore:

P(Positive) = P(Positive | Disease) × P(Disease) + P(Positive | No Disease) × P(No Disease)

Substitute values, using P(No Disease) = 1 - 0.01 = 0.99:

P(Positive) = (0.99 × 0.01) + (0.05 × 0.99)
            = 0.0099 + 0.0495
            = 0.0594
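Step 1 is an instance of the law of total probability: sum the likelihood times the prior over every group a positive result can come from. A possible sketch in Python, where the helper name `total_probability` and its list-based interface are assumptions made for illustration:

```python
def total_probability(priors, likelihoods):
    """Law of total probability: sum of P(B|H) * P(H) over all hypotheses H."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    return sum(lik * prior for lik, prior in zip(likelihoods, priors))


p_disease = 0.01
p_positive = total_probability(
    priors=[p_disease, 1 - p_disease],  # P(Disease), P(No Disease)
    likelihoods=[0.99, 0.05],           # P(Positive|Disease), P(Positive|No Disease)
)
print(round(p_positive, 4))  # 0.0594
```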

Step 2: Apply Bayes Theorem

We want:

P(Disease | Positive)

Substitute values:

P(Disease|Positive) = (0.99 × 0.01) / 0.0594
                    = 0.0099 / 0.0594
                    ≈ 0.1667

Final Answer

P(Disease|Positive) ≈ 16.67%
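The arithmetic can be checked with exact fractions instead of rounded decimals. The sketch below uses Python's standard `fractions` module; the variable names are illustrative assumptions. It shows that 16.67% is the rounded form of exactly 1/6.

```python
from fractions import Fraction

p_disease = Fraction(1, 100)             # P(Disease) = 0.01
p_pos_given_disease = Fraction(99, 100)  # P(Positive|Disease) = 0.99
p_pos_given_healthy = Fraction(5, 100)   # P(Positive|No Disease) = 0.05

true_pos = p_pos_given_disease * p_disease         # 0.0099
false_pos = p_pos_given_healthy * (1 - p_disease)  # 0.0495
p_disease_given_pos = true_pos / (true_pos + false_pos)

print(p_disease_given_pos)                  # 1/6
print(f"{float(p_disease_given_pos):.2%}")  # 16.67%
```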

Surprising Result

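Many people expect that a test which detects the disease 99% of the time means:

99% chance of disease

But the actual probability after a positive result is:

16.67%

Why? Because the disease is very rare. Healthy people far outnumber sick people, so even a 5% false-positive rate produces more incorrect positives (0.0495 of all people tested) than correct positives (0.0099). This is the power of Bayes Theorem.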
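To see how strongly the result depends on how rare the disease is, the same calculation can be repeated for different values of P(Disease). In this sketch the function name and the parameter names `prevalence`, `sensitivity`, and `false_positive_rate` are naming assumptions. Only the prevalence 0.01 comes from the example; the other prevalence values are hypothetical.

```python
def posterior_given_positive(prevalence, sensitivity=0.99, false_positive_rate=0.05):
    """P(Disease|Positive) for a given prevalence P(Disease)."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# 0.01 is the example's prevalence; the other values are hypothetical.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    result = posterior_given_positive(prevalence)
    print(f"P(Disease) = {prevalence} -> P(Disease|Positive) = {result:.4f}")
```

With the same test, the posterior climbs from roughly 2% when 0.1% of people are sick to about 95% when half of them are, so the prior matters as much as the quality of the test.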

Example 2: Email Spam Detection

Suppose:

40% of emails are Spam
60% of emails are Not Spam

Therefore:

P(Spam) = 0.4

P(NotSpam) = 0.6

Suppose:

80% of Spam emails contain the word "Free"

P(Free|Spam) = 0.8

Suppose:

10% of Not Spam emails contain the word "Free"

P(Free|NotSpam) = 0.1

Step 1: Calculate P(Free)

P(Free) = P(Free|Spam) × P(Spam) + P(Free|NotSpam) × P(NotSpam)

Substitute:

P(Free) = (0.8 × 0.4) + (0.1 × 0.6)
        = 0.32 + 0.06
        = 0.38

Step 2: Apply Bayes Theorem

P(Spam|Free) = (0.8 × 0.4) / 0.38
             = 0.32 / 0.38
             ≈ 0.842

Final Answer

P(Spam|Free) ≈ 84.2%

If an email contains the word "Free", there is about an 84.2% probability that it is spam.
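The spam example can also be computed for both classes at once, which makes the role of P(Free) visible: it is the sum of the two numerators, and dividing by it makes the posteriors add up to 1. A minimal sketch, with dictionary and variable names chosen only for illustration:

```python
priors = {"Spam": 0.4, "NotSpam": 0.6}        # P(class)
p_free_given = {"Spam": 0.8, "NotSpam": 0.1}  # P(Free|class)

# Numerator of Bayes Theorem for each class: P(Free|class) * P(class)
joint = {c: p_free_given[c] * priors[c] for c in priors}

# Evidence P(Free) is the sum of the numerators; dividing by it normalizes.
p_free = sum(joint.values())
posterior = {c: joint[c] / p_free for c in joint}

print(round(p_free, 2))                # 0.38
print(round(posterior["Spam"], 3))     # 0.842
print(round(posterior["NotSpam"], 3))  # 0.158
```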

Example 3: Student Pass Prediction

Suppose:

70% of students pass

P(Pass) = 0.7

Among students who pass:

80% study regularly

P(Study|Pass) = 0.8

Among students who fail:

20% study regularly

P(Study|Fail) = 0.2

Step 1: Calculate P(Study)

P(Study) = P(Study|Pass) × P(Pass) + P(Study|Fail) × P(Fail)

Substitute, using P(Fail) = 1 - 0.7 = 0.3:

P(Study) = (0.8 × 0.7) + (0.2 × 0.3)
         = 0.56 + 0.06
         = 0.62

Step 2: Calculate P(Pass|Study)

P(Pass|Study) = (0.8 × 0.7) / 0.62
              = 0.56 / 0.62
              ≈ 0.903

Final Answer

P(Pass|Study) ≈ 90.3%

A student who studies regularly has approximately a 90% chance of passing.
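One way to sanity-check this result is to simulate many students and count. The sketch below is an illustration only: the sample size, the seed, and the variable names are arbitrary assumptions. It draws each student from the probabilities above and then measures the share of regular studiers who passed.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n_students = 200_000

studied = 0
studied_and_passed = 0
for _ in range(n_students):
    passed = rng.random() < 0.7       # P(Pass) = 0.7
    p_study = 0.8 if passed else 0.2  # P(Study|Pass), P(Study|Fail)
    if rng.random() < p_study:
        studied += 1
        studied_and_passed += passed  # True counts as 1, False as 0

estimate = studied_and_passed / studied
print(f"Simulated P(Pass|Study) = {estimate:.3f}")
```

The simulated value should land close to the 0.903 obtained from the formula, with small differences due to random sampling.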

Why Bayes Theorem is Important in Machine Learning

Many machine learning algorithms and applications work by updating probabilities when new evidence arrives. Examples:

Naive Bayes
Bayesian Networks
Spam Filters
Recommendation Systems
Medical Diagnosis Systems

Naive Bayes Classifier directly uses Bayes Theorem to calculate class probabilities.
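As a rough illustration of how this works, the sketch below scores each class as its prior multiplied by the likelihood of every observed word, then normalizes. It relies on the "naive" assumption that words are independent given the class. It is not a complete classifier: there is no training step, and the probabilities are hard-coded. The numbers for "Free" come from Example 2, while the word "Meeting" and its probabilities are hypothetical.

```python
from math import prod

priors = {"Spam": 0.4, "NotSpam": 0.6}
# P(word|class). "Free" uses Example 2's numbers; "Meeting" is hypothetical.
word_likelihoods = {
    "Spam":    {"Free": 0.8, "Meeting": 0.1},
    "NotSpam": {"Free": 0.1, "Meeting": 0.5},
}


def class_probabilities(words):
    """Posterior P(class|words), treating words as independent given the class."""
    scores = {
        c: priors[c] * prod(word_likelihoods[c][w] for w in words)
        for c in priors
    }
    evidence = sum(scores.values())  # normalizing constant
    return {c: score / evidence for c, score in scores.items()}


print(class_probabilities(["Free"]))
print(class_probabilities(["Free", "Meeting"]))
```

With only "Free" observed, the result matches Example 2. Adding the hypothetical word "Meeting" pulls the spam probability down, because that word is assumed to be more common in non-spam emails.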

Bayes Theorem Workflow

Prior Probability
      ↓
Observe Evidence
      ↓
Apply Bayes Theorem
      ↓
Posterior Probability
      ↓
Make Decision
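The workflow can be sketched as a single function that goes from prior to decision. The function name `decide`, its parameter names, and the 0.5 decision threshold are assumptions made for illustration; a real system would choose the threshold based on the cost of each kind of mistake.

```python
def decide(prior, p_evidence_if_true, p_evidence_if_false, threshold=0.5):
    """Run the workflow once and return (posterior, decision)."""
    # Apply Bayes Theorem: numerator over the total probability of the evidence
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1 - prior)
    posterior = numerator / evidence
    # Make decision: accept the hypothesis only if the posterior reaches the threshold
    return posterior, posterior >= threshold


# Example 2: is an email containing "Free" spam?
posterior, is_spam = decide(prior=0.4, p_evidence_if_true=0.8, p_evidence_if_false=0.1)
print(f"{posterior:.3f}", is_spam)  # 0.842 True

# Example 1: does a positive test mean the person has the disease?
posterior, has_disease = decide(prior=0.01, p_evidence_if_true=0.99, p_evidence_if_false=0.05)
print(f"{posterior:.3f}", has_disease)  # 0.167 False
```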

Important Points

Bayes Theorem is used to update probabilities based on new evidence.
It reverses conditional probabilities.
Prior probability represents initial belief.
Likelihood represents the probability of the evidence given that the event is true.
Posterior probability is the updated probability.
Evidence is the overall probability of the observation; dividing by it normalizes the result.
Bayes Theorem is the foundation of Naive Bayes Classifier.
It is widely used in spam detection, medical diagnosis, and machine learning.

Keywords

Bayes Theorem, Conditional Probability, Prior Probability, Posterior Probability, Likelihood, Evidence, Bayesian Learning, Probability Theory, Naive Bayes Foundation, Machine Learning Probability Concepts