Bayes: Examples

Example 1: Spam Email Detection

Problem Statement

A company has analyzed 1000 emails.

  • 400 emails are Spam

  • 600 emails are Not Spam

The word "Offer" appears:

  • In 280 Spam emails

  • In 120 Not Spam emails

A new email arrives containing the word "Offer".

Predict whether the email is Spam or Not Spam using Bayes Classification.

Step 1: Calculate Prior Probabilities

P(Spam) = 400 / 1000
= 0.4

P(NotSpam) = 600 / 1000
= 0.6

Step 2: Calculate Likelihoods

P(Offer | Spam) = 280 / 400
= 0.7

P(Offer | NotSpam) = 120 / 600
= 0.2

Step 3: Calculate Scores

Spam Score

= P(Offer | Spam) × P(Spam)

= 0.7 × 0.4

= 0.28

NotSpam Score

= P(Offer | NotSpam) × P(NotSpam)

= 0.2 × 0.6

= 0.12

Step 4: Compare Scores

Spam Score    = 0.28

NotSpam Score = 0.12

0.28 > 0.12

Prediction

Spam
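The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the function name `bayes_scores` and the dictionary layout are assumptions made for this example, and the counts are the ones given in the problem statement.

```python
def bayes_scores(class_counts, feature_counts):
    """Return {class: P(feature | class) * P(class)} computed from raw counts."""
    total = sum(class_counts.values())
    scores = {}
    for label, n_class in class_counts.items():
        prior = n_class / total                       # Step 1: P(class)
        likelihood = feature_counts[label] / n_class  # Step 2: P(feature | class)
        scores[label] = likelihood * prior            # Step 3: score
    return scores


# Counts from Example 1 (hypothetical variable names).
class_counts = {"Spam": 400, "NotSpam": 600}
offer_counts = {"Spam": 280, "NotSpam": 120}

scores = bayes_scores(class_counts, offer_counts)
prediction = max(scores, key=scores.get)              # Step 4: pick the larger score
print(scores, prediction)
```

Because the values are floating-point, the printed scores may differ from 0.28 and 0.12 in the last decimal places; the comparison and the prediction are unaffected.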

Example 2: Disease Diagnosis

Problem Statement

In a hospital:

  • 300 patients have Flu

  • 700 patients do not have Flu

Among Flu patients:

  • 240 have Fever

Among Non-Flu patients:

  • 70 have Fever

A patient arrives with Fever.

Predict whether the patient has Flu.

Step 1: Prior Probabilities

P(Flu) = 300 / 1000
= 0.3

P(NoFlu) = 700 / 1000
= 0.7

Step 2: Likelihoods

P(Fever | Flu) = 240 / 300
= 0.8

P(Fever | NoFlu) = 70 / 700
= 0.1

Step 3: Calculate Scores

Flu Score

= P(Fever | Flu) × P(Flu)

= 0.8 × 0.3

= 0.24

NoFlu Score

= P(Fever | NoFlu) × P(NoFlu)

= 0.1 × 0.7

= 0.07

Step 4: Compare Scores

Flu Score   = 0.24

NoFlu Score = 0.07

0.24 > 0.07

Prediction

Patient has Flu
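The scores compared in these examples are the numerators of Bayes' theorem. Both classes share the same denominator, P(Fever), so comparing numerators is enough to pick the class. For reference, the full posterior for this example works out as follows:

```latex
\[
P(\text{Flu} \mid \text{Fever})
  = \frac{P(\text{Fever} \mid \text{Flu})\, P(\text{Flu})}
         {P(\text{Fever} \mid \text{Flu})\, P(\text{Flu})
          + P(\text{Fever} \mid \text{NoFlu})\, P(\text{NoFlu})}
  = \frac{0.24}{0.24 + 0.07}
  \approx 0.774
\]
```

So a score of 0.24 is not itself the probability of Flu; after dividing by the total of both scores, the probability of Flu given Fever is about 0.774.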

Example 3: Student Pass Prediction

Problem Statement

A university collected data from 1000 students.

  • 700 students passed

  • 300 students failed

Among students who passed:

  • 630 studied more than 5 hours daily

Among students who failed:

  • 60 studied more than 5 hours daily

A new student studies more than 5 hours daily.

Predict whether the student will Pass or Fail.

Step 1: Prior Probabilities

P(Pass) = 700 / 1000
= 0.7

P(Fail) = 300 / 1000
= 0.3

Step 2: Likelihoods

P(Study | Pass) = 630 / 700
= 0.9

P(Study | Fail) = 60 / 300
= 0.2

Step 3: Calculate Scores

Pass Score

= P(Study | Pass) × P(Pass)

= 0.9 × 0.7

= 0.63

Fail Score

= P(Study | Fail) × P(Fail)

= 0.2 × 0.3

= 0.06

Step 4: Compare Scores

Pass Score = 0.63

Fail Score = 0.06

0.63 > 0.06

Prediction

Pass
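With a single feature, the class size cancels when likelihood and prior are multiplied, so each score equals the joint count divided by the total (630 / 1000 and 60 / 1000 here). The sketch below checks this with exact fractions; the variable names are assumptions chosen for this illustration.

```python
from fractions import Fraction

total = 1000
passed, failed = 700, 300
study_and_pass, study_and_fail = 630, 60

# Score = P(Study | class) * P(class), kept as exact fractions.
pass_score = Fraction(study_and_pass, passed) * Fraction(passed, total)
fail_score = Fraction(study_and_fail, failed) * Fraction(failed, total)

# The class size cancels, leaving joint count / total.
assert pass_score == Fraction(study_and_pass, total)
assert fail_score == Fraction(study_and_fail, total)

prediction = "Pass" if pass_score > fail_score else "Fail"
print(pass_score, fail_score, prediction)  # 63/100 3/50 Pass
```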

Example 4: Weather Prediction

Problem Statement

Historical records show:

  • 400 rainy days

  • 600 non-rainy days

Among rainy days:

  • 300 were cloudy

Among non-rainy days:

  • 180 were cloudy

Today is cloudy.

Predict whether it will rain.

Step 1: Prior Probabilities

P(Rain) = 400 / 1000
= 0.4

P(NoRain) = 600 / 1000
= 0.6

Step 2: Likelihoods

P(Cloudy | Rain) = 300 / 400
= 0.75

P(Cloudy | NoRain) = 180 / 600
= 0.30

Step 3: Calculate Scores

Rain Score

= P(Cloudy | Rain) × P(Rain)

= 0.75 × 0.4

= 0.30

NoRain Score

= P(Cloudy | NoRain) × P(NoRain)

= 0.30 × 0.6

= 0.18

Step 4: Compare Scores

Rain Score   = 0.30

NoRain Score = 0.18

0.30 > 0.18

Prediction

Rain
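If a probability is wanted rather than just a label, the scores can be rescaled so they sum to 1. The helper below is a small sketch of that step; the name `normalize` is an assumption for this example.

```python
def normalize(scores):
    """Divide each score by the total so the values sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


# Scores from Step 3 of Example 4.
scores = {"Rain": 0.75 * 0.4, "NoRain": 0.30 * 0.6}
posteriors = normalize(scores)
print(posteriors)
```

Here the Rain score of 0.30 out of a total of 0.48 corresponds to a probability of rain of about 0.625 on a cloudy day, with 0.375 for no rain.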

Example 5: Loan Approval

Problem Statement

A bank has data for 1000 customers.

  • 800 loans approved

  • 200 loans rejected

Among approved customers:

  • 560 have high salaries

Among rejected customers:

  • 40 have high salaries

A new customer has a high salary.

Predict whether the loan should be approved.

Step 1: Prior Probabilities

P(Approve) = 800 / 1000
= 0.8

P(Reject) = 200 / 1000
= 0.2

Step 2: Likelihoods

P(HighSalary | Approve) = 560 / 800
= 0.7

P(HighSalary | Reject) = 40 / 200
= 0.2

Step 3: Calculate Scores

Approve Score

= P(HighSalary | Approve) × P(Approve)

= 0.7 × 0.8

= 0.56

Reject Score

= P(HighSalary | Reject) × P(Reject)

= 0.2 × 0.2

= 0.04

Step 4: Compare Scores

Approve Score = 0.56

Reject Score = 0.04

0.56 > 0.04

Prediction

Loan Approved
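In practice the counts usually come from individual records rather than a summary. The sketch below rebuilds the 1000 customers as a list of tuples and derives the same probabilities by counting. The record layout `(decision, salary_band)` and the label "Other" are assumptions for this illustration; the non-high-salary counts are derived from the totals (800 - 560 = 240 and 200 - 40 = 160).

```python
from collections import Counter

# Hypothetical record layout: (decision, salary_band).
records = (
    [("Approve", "High")] * 560
    + [("Approve", "Other")] * 240
    + [("Reject", "High")] * 40
    + [("Reject", "Other")] * 160
)

class_counts = Counter(decision for decision, _ in records)
joint_counts = Counter(records)


def score(decision, salary):
    """P(salary | decision) * P(decision), estimated by counting records."""
    prior = class_counts[decision] / len(records)
    likelihood = joint_counts[(decision, salary)] / class_counts[decision]
    return likelihood * prior


scores = {decision: score(decision, "High") for decision in class_counts}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```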

Example 6: Naive Bayes with Multiple Features

Problem Statement

An email contains two words:

  • Offer

  • Free

Given:

P(Spam) = 0.4

P(NotSpam) = 0.6

P(Offer | Spam) = 0.7

P(Free | Spam) = 0.8

P(Offer | NotSpam) = 0.2

P(Free | NotSpam) = 0.1

Predict whether the email is Spam.

Step 1: Calculate Spam Score

Spam Score

= P(Spam)
× P(Offer | Spam)
× P(Free | Spam)

= 0.4 × 0.7 × 0.8

= 0.224

Step 2: Calculate NotSpam Score

NotSpam Score

= P(NotSpam)
× P(Offer | NotSpam)
× P(Free | NotSpam)

= 0.6 × 0.2 × 0.1

= 0.012

Step 3: Compare Scores

Spam Score    = 0.224

NotSpam Score = 0.012

0.224 > 0.012

Prediction

Spam
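Multiplying the per-word likelihoods relies on the "naive" assumption that the words are conditionally independent given the class. A minimal Python sketch of the calculation follows; the function names and dictionary layout are assumptions for this example. The log-space version is not part of the worked example above but is a common variant, since sums of logarithms avoid very small products when there are many features.

```python
import math

priors = {"Spam": 0.4, "NotSpam": 0.6}
likelihoods = {
    "Spam": {"Offer": 0.7, "Free": 0.8},
    "NotSpam": {"Offer": 0.2, "Free": 0.1},
}


def naive_bayes_score(label, words):
    """P(class) times the product of P(word | class) over the observed words."""
    return priors[label] * math.prod(likelihoods[label][w] for w in words)


def naive_bayes_log_score(label, words):
    """Same quantity in log space, where the product becomes a sum."""
    return math.log(priors[label]) + sum(
        math.log(likelihoods[label][w]) for w in words
    )


words = ["Offer", "Free"]
scores = {label: naive_bayes_score(label, words) for label in priors}
log_scores = {label: naive_bayes_log_score(label, words) for label in priors}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Because the logarithm is increasing, both versions rank the classes in the same order.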

Example 7: Disease Diagnosis — Prediction is No Flu

Problem Statement

In a hospital:

  • 300 patients have Flu

  • 700 patients do not have Flu

Among Flu patients:

  • 60 have Body Pain

Among Non-Flu patients:

  • 350 have Body Pain

A patient arrives with Body Pain.

Predict whether the patient has Flu or No Flu.

Step 1: Prior Probabilities

P(Flu) = 300 / 1000
= 0.3

P(NoFlu) = 700 / 1000
= 0.7

Step 2: Likelihoods

P(BodyPain | Flu) = 60 / 300
= 0.2

P(BodyPain | NoFlu) = 350 / 700
= 0.5

Step 3: Calculate Scores

Flu Score

= P(BodyPain | Flu) × P(Flu)

= 0.2 × 0.3

= 0.06

NoFlu Score

= P(BodyPain | NoFlu) × P(NoFlu)

= 0.5 × 0.7

= 0.35

Step 4: Compare Scores

Flu Score   = 0.06

NoFlu Score = 0.35

0.35 > 0.06

Prediction

No Flu

Even though the patient has Body Pain, Body Pain appears more often among patients who do not have Flu in this dataset. So the Bayes classifier predicts: No Flu.
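To see how the symptom alone changes the outcome, the sketch below runs the same calculation for Fever (counts from Example 2) and Body Pain (counts from this example) with identical priors. The function name `predict` and the dictionary layout are assumptions for this illustration.

```python
priors = {"Flu": 300 / 1000, "NoFlu": 700 / 1000}

# P(symptom | class), from Example 2 (Fever) and Example 7 (BodyPain).
likelihoods = {
    "Fever": {"Flu": 240 / 300, "NoFlu": 70 / 700},
    "BodyPain": {"Flu": 60 / 300, "NoFlu": 350 / 700},
}


def predict(symptom):
    """Return the class with the larger score, plus both scores."""
    scores = {
        label: likelihoods[symptom][label] * priors[label] for label in priors
    }
    return max(scores, key=scores.get), scores


for symptom in likelihoods:
    label, scores = predict(symptom)
    print(symptom, "->", label, scores)
```

Fever is far more common among Flu patients, which outweighs the lower Flu prior; Body Pain is more common among patients without Flu, so both the likelihood and the prior point to No Flu.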